Content Mining Myth Busting 0: “It doesn’t matter to me”

In the next few posts I shall address some common myths about Content Mining (TDM). Many are implicitly or explicitly put forward by Toll-Access Publishers (TAPublishers).

The most serious myth is that it’s not important.

Actually it’s important to everyone. The two major information successes of the first decade of this century were both content-mining:

  • Google has systematically mined the Open Web using machines and added its own semantics.
  • Wikipedia has systematically mined the infosphere using humans and added its own semantics.

If you have ever used Google or ever used Wikipedia then you have used the results of content-mining.

Wikipedia is beyond criticism – if you are unhappy about it, get involved and change it. But what about Google?

Well, Google doesn’t do science.

If I want to know what species was recorded in this place at that date; or what chemical reaction occurred under these conditions, then Google doesn’t help. You need a semantic scientific search engine.

Discipline-based semantic content mining is the most important development in applied information science. If you want to build the library of the future you should be doing this – not paying rent to third parties. If you want to do multidisciplinary research you need the results of content-mining.

If we were allowed to do it, then I wouldn’t be writing this blog post. As it is, the TAPublishers are fighting tooth-and-nail to stop us content-mining. People are doing it, but in secret, because if they do it in public they will be cut off or sued. It’s not surprising that we don’t yet have high visibility.

But that’s going to change. And change rapidly. We have literally billions of dollars of information locked up in the current scholarly literature. And 10,000 papers come out each day. We need content mining to manage these – read them for us. Organize them. Let us search after we’ve read them. Do some of our routine thinking for us.

On our own terms for our own needs.

It can happen, just as Wikipedia happened.

So don’t turn away – believe that Content Mining matters – matters massively.