Open… and Yet Shut
I am a passionate fan of the World of Open. Without a network ethic of Open we will strangle the world of connectivity with restriction and force ourselves into re-regulating just those things which we have so recently decongested. Yet not all the things we want to designate as Open will be open just because we put the word in front of the previous descriptor. In most instances being Open does not necessarily do the trick on its own – someone has to add another level of value to change Open meaning “available” to Open meaning “useful”. So, the argument that Open Access journals are an appropriate response to the needs of researchers and to a wider public who should be enjoying unfettered access to the fruits of taxpayer-funded research would seem to be a given. But creating real benefits beyond such general statements of Good seems very hard, and the fact that researchers cannot see tangible benefits beyond occupying the moral high ground probably connects with the grindingly slow advance of Open Access to around a quarter of the market in a decade.
I feel much the same about Open Citations. Reading the latest Initiative for Open Citations report at i4Oc.org, I find really good things:
“The present scholarly communication system inadequately exposes the knowledge networks that already exist within our literature. Citation data are not usually freely available to access, they are often subject to inconsistent, hard-to-parse licenses, and they are usually not machine-readable.”
And I see an impressive number of members, naturally not including either Clarivate or Elsevier. Yet using the argument above I would say that either of these is most likely to add real value to Open Citations, and certainly more likely than most of the members of the club. What we have here, all the time, is a constant effort to try to emulate a world which has largely now passed by, and we do it by trying to build value propositions from wholly owned artifacts or data elements, thus turning them into proprietary systems. This urge to monopoly is clearly being superseded: what has happened is that the valuation systems by which markets and investors measure value have not caught up with the way users acknowledge value.
Outside of the worlds of research and scholarly communication it seems clear that the most impressive thing you can say about the world of data is “Use it and lose it”. The commoditization of data as content is evident everywhere. The point about data is not Big Data – a once prominent slogan that has now diminished into extinction – but actionable data. The point is not collecting all the data into one place – it can stay wherever it rests as long as we can analyse and utilise it. The point is the level of analytical insight we can achieve from the data available, and this has much to do with our analytics, which is where the real value lies. Apply those proprietary analytics back into the workflow of a particular sector – the launch music around Artifacts in Healthcare in Cambridge MA was very noticeable last week – and value is achieved for an information system. And one day, outside of copyright and patents, and before we get to the P&L account, someone will work out how we index those values and measure the worth of a great deal of the start-up activity around us.
So from this viewpoint the press release of the week came from Clarivate Analytics and did not concern Open at all directly. It concerned a very old-fashioned value indeed – Brand. If the world of scholarly communication is really about creating a reputation marketplace, then ISI, Eugene Garfield’s original vehicle for establishing citation indexing from which to promulgate the mighty Impact Factor, is the historical definition point of the market in scholarly reputation. By refounding it and relaunching it, Clarivate seem to me to be not just saying how much the market needs that sort of research right now, but to be aiming at the critical value-adding role: using the myriad of data available to redefine the measurement of reputation in research. In many ways Open Citations will assist that, but the future will be multi-metric, the balance of elements in the analytics will be under greater scrutiny than ever before, and ISI will need to carry the whole marketplace with it to achieve a result. That is why you need a research institute, not just a scoring system. And even then the work will need a research institute to keep it in constant revision – unlike the Impact Factor, the next measure will have to be developed over time, and keep developing so that it cannot be influenced or “gamed”. In the sense I have been using it here, ISI becomes the analytical engine sitting on top of all of the available but rapidly commoditising research data.
We have come very quickly from a world where owning copyrights in articles and defending that ownership was important, to this position of commoditized content and data as a precursor to analysis. But we must still be prepared for further shortening of the links and cycle re-adjustments. Citations of evidential data, and citations of Findings-as-data without article publishing, will become a flood rather than the trickle they are now. Add in the vast swathes of second-tier data from article publishing in India, China or Brazil. Use analytics not just for reputational assessment, but also for trend analysis, repeat experiment verification and clinical trials validation. We stand in a new place, and a re-engineered ISI is just what we need beside us.
Originally published on davidworlock.com on 14th February 2018