TDM at European Parliament – tweet-like report

Great meeting at Brussels EP yesterday. Would have liked to tweet but didn’t have password. There *were* tweets by the MEPs, so I wrote my notes like the tweets I would have made. May be useful to some, mystifying to others…


Also Julia Reda MEP was there at the start!

Here’s the panel (7-8) run by Catherine Stihler MEP (who chaired well and let everyone else speak)

Marco Giorello, Head of Copyright Unit, DG Connect

Problem: data analytics techniques involve making copies
These copies are relevant to copyright
Legal situation unclear; some exceptions: temporary copying, and copying for research purposes
(a) contractual conditions and policies
(b) legislation – UK exception – because there was already research exception (but leads to Euro fragmentation).
Other states have “research exception”. Other states e.g. France, and ?Germany – we don’t want 15 different legislations
Dec 2015 – EC trying to find balance – PIRO [Public Interest Research Organization, yes I don’t know what that is either, so asked later…] – to address Univs and research insts.
But aware that Univs have private partners
UK “non-commercial” has caused problems.
Not only about copyright – but also technology, standards …

John Boswell SAS (software company) – analysis of data.
TDM is just one form of data analysis. Copyright wider, bcos movies, images, voice all covered by copyright
analysis of 1 million docs to extract sentiment and time series, does not implicate (C).
(C) is protection of expression of an idea. Analysing this does not copy the expression or create a derivative work. (C) must not prevent TDM. Issue much bigger than Universities. World has so much (C) – ca 300,000 every minute (FB, Tweets, Instagram, etc.). Much covered by copyright
Analysis of social media is major good. Govs can use social media to predict economics
Debate must realise that TDM does not implicate (C)
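
To make the "sentiment and time series" point above concrete, here is a minimal sketch of what such an analysis might look like. It is not SAS's method: the word lists, the dates and the posts are all made up for illustration, and real sentiment analysis is far more sophisticated. The thing to notice is that the output is a handful of numbers per day, not the text that went in.

```python
from collections import defaultdict

# Hypothetical toy lexicon, invented for this sketch only.
POSITIVE = {"good", "great", "growth"}
NEGATIVE = {"bad", "poor", "crisis"}


def sentiment_score(text):
    """Count positive minus negative lexicon words in one document."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def daily_sentiment(posts):
    """Aggregate (date, text) pairs into a date -> mean score time series."""
    by_day = defaultdict(list)
    for day, text in posts:
        by_day[day].append(sentiment_score(text))
    # Sort by date so the result reads as a time series.
    return {day: sum(scores) / len(scores) for day, scores in sorted(by_day.items())}


# Made-up posts with made-up dates.
posts = [
    ("2016-05-01", "great growth in exports"),
    ("2016-05-01", "poor retail figures"),
    ("2016-05-02", "crisis talk and bad news"),
]
series = daily_sentiment(posts)
```

With these toy inputs `series` holds one mean score per day; scaling the same idea to a million documents changes the volume, not the shape of the result.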

Therese Comodini Cachia (MEP and meeting convener)

Don’t wish to have debate on copyright vs TDM
Startups need protection from copyright and also need to use TDM
Startup innovation is EU priority – social and economic development
TDM will lead to new economic development
Reda report focussed on academic research.
innovation not just economic but also health and social
would give good push to innovation

Jakub Czakon (Stermedia) – (data analyst Physics + finance + chess)
loves data
TDM = data -> information -> knowledge
example s/w that matches CVs onto job offers
extract important info from data
try to match qualifications – find connections and distances between documents
health care – diagnosis of tumour – used machine learning and public data – found public competition training set.
looks for cells and local structure. Created diagnostic indicators.
facial recognition
these skills and startups are critical for Europe
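
The CV-matching example above ("distances between documents") can be sketched roughly as follows. This is not Stermedia's software: it assumes the simplest possible approach (bag-of-words vectors compared by cosine similarity), and the CV, the job offers and the function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document matches nothing
    return dot / (norm_a * norm_b)


def rank_offers(cv_text, offers):
    """Return (offer_id, similarity) pairs, best match first."""
    cv_vec = Counter(tokenize(cv_text))
    scored = [
        (offer_id, cosine_similarity(cv_vec, Counter(tokenize(text))))
        for offer_id, text in offers.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data, invented for this sketch.
cv = "Physics graduate with Python, statistics and machine learning experience"
offers = {
    "data-analyst": "Data analyst with Python and statistics skills, machine learning a plus",
    "chef": "Restaurant chef with pastry and kitchen management experience",
}
ranking = rank_offers(cv, offers)
```

Here the data analyst offer ranks above the chef offer because it shares more words with the CV. A real system would presumably weight terms (e.g. down-weighting common words like "with") and use richer features, but the "distance between documents" idea is the same.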

Adriana Homolova – data journalist and visualisation
dataScience >> data analysis (insight into data) >> data analytics (analysing large amounts of data) >> data mining
uses AI.
NeuralNets, RandomForests, NearestNeighbours
Data mining is starting in journalism
journalism qualitative vs quantitative – “Interview data”
makes journalism stronger
data analysis used to filter professors for side jobs for “interesting people”
e.g. 3 side jobs per prof
BBC analysed tennis for match-fixing for repeated underperforming
published on github
revolutionary in journalism
Panama Papers got 400 (competing) journalists to abandon secrecy: “newsroom collaboration”
data are the raw material of our age.
copyright can do much harm.
data analytics are extension of our thought processes
we must look how to open up – e.g. copyleft

Jean-Francois Dechamp DG Research and Innovation
both policy creation and funding agency
FutureTDM and OpenMinTeD
objective – best conditions to do their job
researchers are both producers and consumers
researchers often don’t own copyright of their research
competition fierce – merger of Springer and Nature
data journals
publishers => service providers

Sergey Filippov Lisbon Council (Brussels Innovation Think tank)
Report 2 years ago on TDM in Academic and Research Communities in Europe
Academic pubs 1.5 / year, 60 million in total
“Publish or perish” leads to distraction from teaching and poor research
Traditional k/w search; TDM can recognise concepts, facts, relations, preparatory
idea -> lit rev (TDM) -> hypothesis (TDM) -> data methodology -> analysis -> conclusions
what’s problem? copyright …
researched this…
scientific publications: 1200 pubs, 47% from US, EU 26%; EU cited less than US
applicable to all subjects, not just hard sciences
10-fold increase in Data mining, TDM papers in last 5 years
US 21%, EU 28, CN 10, IN 13%
Patents in data mining huge growth in China
Then he interviewed 20 researchers
most people don’t know about TDM or are not tech-savvy
many worried about copyright
leads to results of lower quality
academics want exceptions
growth in CN and IN and US
Europeans concerned but worried about clarity
if we don’t manage to get TDM used, then far-reaching negative implications for EU


Christoph Bruch: Open Science Coordination Office of the Helmholtz Association

lot of researchers want assurance
Must not be universities only
(to Marco EC) must not limit how society can use information
limit will do very much damage

Marco – commercial vs nc. Current draft is not final.
Why not business activities? Exception would also be (C) but certain classes of beneficiaries.
must look at (C) with care
cause friction
Pharma already use licences
Existing lucrative Market for re-use so EC can’t easily sweep it away
attempt to give full legal certainty
will be positive for academia and neutral for others

Boswell SAS – there is broad exception for TDM as “fair use” if not used for other purpose
interim step – new work is not copy of expression
in EC temporary copy should be covered by 5.1 of InfoSoc directive
PPIs with universities – lines are blurred
Should not make lines between univs and others

PM-R gave TDMer point of view and asked about PIRO – more later