Full-text mining of MIT thesis content: Help us experiment | MIT Libraries News

“Curious about text mining? So are we! The MIT Libraries is exploring what a text mining service for our thesis collection could look like. What does that mean? Using a simple interface or by building your own tool using an API, you can search and download the full text of theses and dissertations published at MIT, and then use that content for your own further research and analysis.

So we’re building a prototype API to experiment, and here’s where we need you: We want to know more about how researchers might use such a service and what it should include. Do you do full text data mining as part of your research? Do you use other services like this and have opinions about them? Want to help us test our prototype?…”

Data Access :: SPARC

“The SPARC [Stratosphere-Troposphere Processes and their Role in Climate] Data Centre provides open access to most data sets hosted on the ftp server. Ftp access and a description of each data set, including in some cases key findings and references, is provided through the links below. Open access is provided for the benefit of the SPARC scientific community, to promote new research and applications beyond the scope of the original studies, and to provide transparency and reproducibility of published results. Certain data sets require user agreements while others remain restricted to access by SPARC project scientists pending publication of associated analysis….”

Open data et décision de justice « anonyme », un mariage impossible ? [Open data and "anonymous" court decisions: an impossible marriage?]

From Google’s English: “The question of the merits of disseminating the names of parties in court decisions has never been taken seriously by the public authorities. The online dissemination of jurisprudence on various websites, public (Legifrance) and private (sometimes greedy appetites), coupled with the power of new search engines, has in particular made the anonymization of court decisions a matter fundamental.”

Scott Pruitt and Anti-Science Activity at the EPA

“While it appears that exposure by the news media has prompted the administration to at least temporarily rescind its order to remove Web content on climate change, there is no guarantee that new orders will not emerge unless we have pledges from Mr. [Scott] Pruitt [Trump’s nominee to head the Environmental Protection Agency] to safeguard public access to scientific information about climate change and other issues. Indeed, several climate change–related Web pages and reports have been removed from the State Department website. Public servants should be free to state simple scientific facts. Americans have the right to see and benefit from taxpayer-funded research, and scientists have the right to share their findings openly and honestly, without political pressure, manipulation, or suppression. Political staff should never be in charge of deciding what scientific conclusions the public is allowed to see….”

What should we do now Beall’s List has gone?

“It’s now been widely discussed that Jeffrey Beall’s list of predatory and questionable open-access publishers — Beall’s List for short — has suddenly and abruptly gone away. No-one really knows why, but there are rumblings that he has been hit with a legal threat that he doesn’t want to defend.

To get this out of the way: it’s always a bad thing when legal threats make information quietly disappear; to that extent, at least, Beall has my sympathy.

That said — over all, I think making Beall’s List was probably not a good thing to do in the first place, being an essentially negative approach, as opposed to DOAJ’s more constructive whitelisting approach. But under Beall’s sole stewardship it was a disaster, due to his well-known ideological opposition to all open access. So I think it’s a net win that the list is gone.

But, more than that, I would prefer that it not be replaced….”

Freeze on Federal Activities Gives Scientists a Chill – The Chronicle of Higher Education

“Administration officials initially spoke of a freeze on research grants at several agencies, including the EPA and Department of Agriculture, along with a ban on tweets or other social-media comments by agency officials, and a halt in any regulatory changes, even those already approved.

Then, under pressure from members of Congress and an array of critics outside the government, some of those changes are reportedly getting walked back, now described as temporary, or clarified to be less sweeping than they initially appeared.

The EPA made clear that current grants would not be blocked, and that plans to delete climate-change references from the agency’s website had been reversed. The administration’s nominee for secretary of commerce, Wilbur Ross, said that scientists at the National Oceanic and Atmospheric Administration would be allowed to publicly share peer-reviewed findings….”

How Frankenstein helps a scientist think about his research.

“Technological hubris is ignoring the suggestions of others—even if only by neglecting to inform them of an advance. It is most common among those suffering from the curse of knowledge: scientists.

That’s why my colleagues and I seek to ensure that all gene drive research takes place in the open light of day. People deserve a voice in decisions that might affect them, and building gene drive systems behind closed doors denies them that opportunity. Even apart from the moral hazard, keeping research plans secret—as the current scientific enterprise incentivizes us to do—is appallingly inefficient and outright dangerous. It doesn’t just slow the rate of advances, thereby jeopardizing our ability to sustain our civilization; it practically invites global catastrophic risk. No one, be they science-fiction author or Austin Burt himself, anticipated a form of gene drive as versatile as is theoretically enabled by CRISPR. What else have we not anticipated that this time might be truly dangerous? And given this possibility, why on earth do we send out small teams of ultra-specialists, mostly working on their own and in secret, to find and open every technological box they can? Better to default to open research plans, enabling diverse teams to evaluate new advances, implementing measures to obscure and counter anything deemed truly dangerous, than to proceed blindly.

Of course, any wholesale restructuring of the scientific enterprise would also be an act of reckless hubris. My personal rule of ecological engineering: start local and scale up only if warranted. In this case, the best “local test” is the field of gene drive research. Scientific journals, funders, policymakers, and intellectual property holders should change the incentives to ensure that all proposed gene drive experiments are open and responsive.

The message from fiction and reality is clear: Scientists should hold themselves morally responsible for all consequences of their work. The least we can do is muster enough humility to ask for help….”

EPA Scientists’ Work May Face ‘Case By Case’ Review By Trump Team, Official Says : The Two-Way : NPR

“Scientists at the Environmental Protection Agency who want to publish or present their scientific findings likely will need to have their work reviewed on a “case by case basis” before it can be disseminated, according to a spokesman for the agency’s transition team….”

Trump orders federal agencies to stop communicating with the public. From +B…

“According to an email sent Monday morning and obtained by BuzzFeed News, the [US Department of Agriculture] told staff — including some 2,000 scientists — at the agency’s main in-house research arm, the Agricultural Research Service (ARS), to stop communicating with the public about taxpayer-funded work….”

Open Access, Academia.edu, and why I’m all-in on Zenodo.org | Pocket Change

“Migrating from Academia.edu to Zenodo.org

I fully advocate leaving Academia.edu, but what purpose does it serve to simply delete your account? You are removing publications that are, in the very least, freely and openly available at the moment. Essentially, the best decision is to migrate documents to Zenodo.org, and allow at least one week for Google to fully index migrated content before deleting the Academia.edu account. My MA thesis entitled ‘Recent Advances in Roman Numismatics,’ about the application of Linked Open Data methodologies toward Roman numismatics with Nomisma.org and Online Coins of the Roman Empire, had been available in both the ANS Digital Library and Academia.edu as of January 28, 2016. Due to our superior use of microdata and full-text indexing, the ANS Digital Library version surpassed Academia days after it was published. I uploaded my thesis to Zenodo.org January 29, 2016, and it was already on the first page of Google three days later.

Many of us have uploaded a substantial number of documents to Academia.edu, and it might be tedious to re-upload these documents into a new system, especially with regard to re-entering publication metadata. I have sought to rectify this by facilitating a more efficient migration system. I have developed a framework that is capable of parsing metadata from an Academia.edu profile (although not all publications are listed when the profile page loads), accepting re-uploaded documents (since these cannot be extracted from Academia.edu directly), and uploading these contents into Zenodo.org. This framework itself is open source and available on GitHub. I will save the technical discussion for a different venue.”

[GOAL] Elsevier as an open access publisher


Reading the discussion about Elsevier as an ‘OA publisher’ and the discussion about CC-BY as a ‘requirement’ for OA, we analysed the Elsevier metadata in Crossref.

When we harvested the data a few days ago, the most frequently used license values were:

675,343 : http://www.elsevier.com/open-access/userlicense/1.0/

191,530 : http://creativecommons.org/licenses/by-nc-nd/4.0/

122,013 : http://creativecommons.org/licenses/by-nc-nd/3.0/

The first one is not CC-BY, but according to https://www.elsevier.com/about/company-information/policies/open-access-licenses users at our universities have access to these articles, and that's what counts, I would say.

Of about 15.2 million Elsevier article metadata records, about 989,000 point to freely accessible articles.
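
As a rough check of that proportion, here is a minimal Python sketch. It assumes that the figure of about 989,000 is simply the sum of the three license counts listed above, and it treats 15.2 million as an exact total although the post gives it only approximately. The names `license_counts`, `TOTAL_RECORDS` and `summarize` are illustrative and not part of any Crossref tooling.

```python
# License counts as quoted above (Elsevier metadata harvested from Crossref).
license_counts = {
    "http://www.elsevier.com/open-access/userlicense/1.0/": 675_343,
    "http://creativecommons.org/licenses/by-nc-nd/4.0/": 191_530,
    "http://creativecommons.org/licenses/by-nc-nd/3.0/": 122_013,
}

# Assumption: the post says "about 15.2 million"; the exact total is not given.
TOTAL_RECORDS = 15_200_000


def summarize(counts, total):
    """Return (records carrying one of the listed licenses, their share of total)."""
    licensed = sum(counts.values())
    return licensed, licensed / total


licensed, share = summarize(license_counts, TOTAL_RECORDS)
print(f"{licensed:,} records, {share:.1%} of all records")
# prints: 988,886 records, 6.5% of all records
```

Under these assumptions, roughly one Elsevier record in fifteen carries one of the three listed licenses.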

I don't want to judge these numbers, but I have heard of publishers that have 100% OA.