Wikipedia talk:WikiProject Molecular Biology/Molecular and Cell Biology/Archive 10

From Wikipedia, the free encyclopedia

Facto Post – Issue 15 – 21 August 2018

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Neglected diseases
Anti-parasitic drugs being distributed in Côte d'Ivoire
What's a Neglected Disease?, ScienceSource video

To grasp the nettle, there are rare diseases, there are tropical diseases and then there are "neglected diseases". Evidently a rare enough disease is likely to be neglected, but neglected disease these days means a disease not rare, but tropical, and most often infectious or parasitic. Rare diseases as a group are dominated, in contrast, by genetic diseases.

A major aspect of neglect is found in tracking drug discovery. Orphan drugs are those developed to treat rare diseases (rare enough not to have market-driven research), but there is some overlap in practice with the WHO's neglected diseases, where snakebite, a "neglected public health issue", is on the list.

From an encyclopedic point of view, lack of research also may mean lack of high-quality references: the core medical literature differs from primary research, since it operates by aggregating trials. This bibliographic deficit clearly hinders Wikipedia's mission. The ScienceSource project is currently addressing this issue, on Wikidata. Its Wikidata focus list at WD:SSFL is trying to ensure that neglect does not turn into bias in its selection of science papers.

Links

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 13:23, 21 August 2018 (UTC)

Expert attention

This is a notice about Category:Molecular and Cell Biology articles needing expert attention, which might be of interest to your WikiProject. It will take a while before the category is populated. There might be as few as one page in the category, or zero if someone has removed the expert request tag from the page.  — Mr. Guye (talk) (contribs)  22:31, 9 September 2018 (UTC)

@Mr. Guye: Thanks! I've just also found two other categories. How easy would these be to merge?
T.Shafee(Evo&Evo)talk 23:59, 9 September 2018 (UTC)

Nonglobular Protein Taskforce

There was an editathon at the recent NGP-Net conference for nonglobular protein science, which is affiliated with a European consortium/scientific society. There was interest from the student network in sponsoring regular editathons and curating some pages from our topic area. Many of the pages that were identified as needing improvement are within the bounds of WP:MCB (or occasionally WP:WCB), so it would make sense to organize within this project.

Some other projects have the concept of a task force (example) for organizing sub-communities with shared interests. Technically, this consists of a slightly modified banner on the talk page and a set of categories so that task force members can identify pages requiring improvement. Would this community be supportive of starting a task force relating to nonglobular proteins?

Example topics to include would be intrinsically disordered proteins, protein tandem repeats, amyloid, low-complexity regions, and methods for investigating such topics.

-- Quantum7 07:17, 14 September 2018 (UTC)

@Quantum7: Great news. Might it be worth running it on this talk page instead of spinning off a taskforce (to reduce admin overhead)? A separate taskforce page could be later separated off if the discussion traffic becomes too overwhelming here. Either way, I'm happy to help out the venture in whatever way is most useful. I work in nonglobular proteins (cysteine-rich proteins and hydroxyproline-rich glycoproteins) which are both redlinks that have been in the back of my mind for a while. Several nonglobular protein superfamilies (e.g. defensins) are also in a sorry state. T.Shafee(Evo&Evo)talk 13:41, 14 September 2018 (UTC)

Circulating mitochondrial DNA

I have just created a stub on circulating mitochondrial DNA, which is a topic with over 800 Google Scholar hits and a breathless writeup on September 13 in Scientific American. If anybody wants to improve my terrible stub, please do. Abductive (reasoning) 20:34, 20 September 2018 (UTC)

Can anyone on this wikiproject help with the disambiguation links to Adhesin? It is a specialist topic and I don't have the expertise to fix the list here.— Rod talk 16:01, 26 September 2018 (UTC)

I think that the dab page needs to be turned into a broad concept article. Jo-Jo Eumerus (talk, contributions) 16:42, 26 September 2018 (UTC)

Pyruvate kinase

The article on pyruvate kinase says that "In Archaean oceans, phospho-enolpyruvate may have been present abiotically." I doubt that phosphoenolpyruvate is stable in water. Wouldn't it hydrolyze immediately to phosphoric acid and pyruvic acid?? I didn't want to edit the article because I'm not sure of this but it sounds suspicious to me. This was the only place I could find to leave a comment. And I will sign my post with four tildes, if that's what you want.

The mark of the anonymous commenter: 169.229.232.53 (talk) 21:37, 26 September 2018 (UTC)

Facto Post – Issue 16 – 30 September 2018

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

The science publishing landscape

In an ideal world ... no, bear with your editor for just a minute ... there would be a format for scientific publishing online that was as much a standard as SI units are for the content. Likewise cataloguing publications would not be onerous, because part of the process would be to generate uniform metadata. Without claiming it could be the mythical free lunch, it might reasonably be argued that sandwiches can be packaged much alike and have barcodes, whatever the fillings.

The best on offer, to stretch the metaphor, is the meal kit option, in the form of XML. Where scientific papers are delivered as XML downloads, you get all the ingredients ready to cook. But you have to prepare the actual meal of slow food yourself. See Scholarly HTML for a recent pass at heading off XML with HTML, in other words in the native language of the Web.

The argument from real life is a traditional mixture of frictional forces, vested interests, and the classic irony of the principle of unripe time. On the other hand, discoverability actually diminishes with the prolific progress of science publishing. No, it really doesn't scale. Wikimedia as a movement can do something in such cases. We know from open access, we grok the Web, we have our own horse in the HTML race, we have Wikidata and WikiJournal, and we have the chops to act.

Links

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 17:57, 30 September 2018 (UTC)

Removal of 3DMet data from {{Chembox}}

At the moment, the article about 3DMet (one in the Category:Chemical databases (49)), is up for deletion (AfD). Reason is lack of WP:NOTABILITY, as measured by ~not being referenced in secondary (independent) sources. IOW, virtually no sources refer to it as useful etc. or actually use 3DMet.

If and when this article is deleted, it follows that the {{Chembox}} data row (|3DMet= in {{Chembox}}) should be removed too (we should not link or point to an irrelevant and non-notable database). Today, some 126 articles use parameter |3DMet=: [1].

Action needed: The only way to save this information is to prove notability of 3DMet by adding secondary sources (read the AfD though for an investigation already made into this: few sources are sound). -DePiep (talk) 07:23, 19 October 2018 (UTC)

Huh? WP:N has nothing to do with infoboxes. That an article isn't notable by the particular definition used in notability policies does not mean it isn't useful in chemboxes. Jo-Jo Eumerus (talk, contributions) 14:07, 19 October 2018 (UTC)
If the 3DMet database is not noteworthy in Wikipedia, then referring/linking to it is not noteworthy either. How can an irrelevant database be relevant in infoboxes? -DePiep (talk) 14:50, 19 October 2018 (UTC)
Or, the other way around: if there is a useful application of 3DMet data in literature (for some compound), that would be a supporting reference for 3DMet. -DePiep (talk) 14:53, 19 October 2018 (UTC)
"Relevant" in the sense of an infobox and "relevant" in the sense of WP:Notability are not the same thing at all. For example, if the database is frequently used by chemists but seldom discussed. Jo-Jo Eumerus (talk, contributions) 15:07, 19 October 2018 (UTC)
Yes, that is the point: if its usage & virtues are not ending up in sources (publications), it is clearly not relevant. Not the database, not the application wrt a compound. If the database would add really something to some research issue, that creates relevance. In the end: Why have a redlink i.e. nonexistent "3DMet" in the lefthand side of {{Chembox}}, with data (ID) in righthandside? -DePiep (talk) 15:30, 19 October 2018 (UTC)
Talk central is here. I'll reproduce/link your argument there shortly. -DePiep (talk) 19:48, 19 October 2018 (UTC)

Advice on new article: Ragulator-Rag complex

I just added a new article by my class (Ragulator-Rag complex) and was wondering what to do to get it noticed and properly indexed or categorized. 137.142.46.79 (talk) 19:38, 24 October 2018 (UTC) Sorry, I wasn't logged in, Jparcoeur (talk) 19:40, 24 October 2018 (UTC)

@Jparcoeur: You came to the right place. I've made some edits and suggestions on the article's talk page. T.Shafee(Evo&Evo)talk 11:08, 25 October 2018 (UTC)

Does this need to be its own article, or can it be redirected to something else, like saccharide possibly? I don't know enough to tell. ♠PMC(talk) 09:00, 26 October 2018 (UTC)

Facto Post – Issue 17 – 29 October 2018

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Wikidata imaged

Around 2.7 million Wikidata items have an illustrative image. These files, you might say, are Wikimedia's stock images, and if the number is large, it is still only 5% or so of items that have one. All such images are taken from Wikimedia Commons, which has 50 million media files. One key issue is how to expand the stock.

Indeed, there is a tool. WD-FIST exploits the fact that each Wikipedia is differently illustrated, mostly with images from Commons but also with fair use images. An item that has sitelinks but no illustrative image can be tested to see if the linked wikis have a suitable one. This works well for a volunteer who wants to add images at a reasonable scale, and a small amount of SPARQL knowledge goes a long way in producing checklists.
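The kind of checklist mentioned above can be sketched in a few lines. This is only an illustration, not how WD-FIST works internally: the function name `checklist_query` and its defaults are hypothetical, and the query simply lists items of a given class that have at least one sitelink but no statement for a chosen image property (P18 by default). On a very large class the public query service may well time out, so a narrower class than the default is advisable in practice.

```python
import re

# Wikidata identifiers look like Q5 or P18: a letter followed by a positive integer.
_ID_PATTERN = re.compile(r"^[PQ][1-9][0-9]*$")


def checklist_query(class_qid="Q5", image_pid="P18", limit=50):
    """Build a SPARQL query listing items that have sitelinks but lack an image.

    class_qid -- item class to restrict to (Q5, "human", is only a default here)
    image_pid -- Commons-linking property to test for (P18 is the basic image)
    limit     -- maximum number of rows to return
    """
    for value, prefix in ((class_qid, "Q"), (image_pid, "P")):
        if not (_ID_PATTERN.match(value) and value.startswith(prefix)):
            raise ValueError(f"expected an identifier starting with {prefix}: {value!r}")
    if limit < 1:
        raise ValueError("limit must be at least 1")

    return (
        "SELECT ?item ?itemLabel ?links WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} ;\n"
        "        wikibase:sitelinks ?links .\n"
        "  FILTER(?links > 0)\n"
        f"  FILTER NOT EXISTS {{ ?item wdt:{image_pid} [] }}\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Paste the printed text into the query service to get a checklist.
    print(checklist_query(image_pid="P18", limit=20))
```

Swapping `image_pid` for another Commons-linking property gives the corresponding checklist for that property.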

Gran Teatro, Cáceres, Spain, at night

It should be noted, though, that there are currently 53 Wikidata properties that link to Commons, of which P18 for the basic image is just one. WD-FIST prompts the user to add signatures, plaques, pictures of graves and so on. There are a couple of hundred monograms, mostly of historical figures, and this query allows you to view all of them. commons:Category:Monograms and its subcategories provide rich scope for adding more.

And so it is generally. The list of properties linking to Commons does contain a few that concern video and audio files, and rather more for maps. But it contains gems such as P3451 for "nighttime view". Over 1000 of those on Wikidata, but as for so much else, there could be yet more.

Go on. Today is Wikidata's birthday. An illustrative image is always an acceptable gift, so why not add one? You can follow these easy steps: (i) log in at https://tools.wmflabs.org/widar/, (ii) paste the Petscan ID 6263583 into https://tools.wmflabs.org/fist/wdfist/ and click run, and (iii) just add cake.

Birthday logo
Links

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 15:01, 29 October 2018 (UTC)

Editors in this WikiProject may be interested in the featured quality source review RFC that has been ongoing. It would change the featured article candidate process (FAC) so that source reviews would need to occur prior to any other reviews for FAC. Your comments are appreciated. --IznoRepeat (talk) 21:34, 11 November 2018 (UTC)

Infobox gene changes

Yep, still on this old horse. Have posted at Wikiproject Medicine: Wikipedia_talk:WikiProject_Medicine#Help_needed_improving_gene_infobox_(alternate_title:_Tom_(LT)'s_annual_gene_infobox_whinge) --Tom (LT) (talk) 10:39, 12 November 2018 (UTC)

Draft:C3orf67 review

Could somebody take a look at Draft:C3orf67? I have a basic familiarity with the subject matter, but don't know what our notability guidelines are for these sorts of articles. Should this be moved to mainspace? Please leave your comments on the draft. Thanks. -- RoySmith (talk) 17:26, 13 November 2018 (UTC)

Protein-coding genes are considered notable by default but the format is odd. I'd point the author to TBR1 as a template seeing as that is a good article. Jo-Jo Eumerus (talk, contributions) 17:48, 13 November 2018 (UTC)

Topic Page on Selfish genetic element

PLOS Genetics has now joined PLOS Computational Biology in its Topic Pages initiative. As part of this, an article was drafted, peer reviewed and published in PLOS Genetics and has now been copied over to the Selfish genetic element wikipedia page. Comments and suggestions welcome! T.Shafee(Evo&Evo)talk 00:41, 17 November 2018 (UTC)

Large genetics class off the rails

Please see Wikipedia:Education_noticeboard#Large_genetics_class_off_the_rails and the pages linked there, which need checking. Jytdog (talk) 20:11, 18 November 2018 (UTC)

Help with draft on Ferlin proteins

There is a new editor looking for help with a draft, User:Sam orbital/sandbox/Ferlin. While from a layman standpoint, there are no issues with the article, and imho it is ready for mainspace, it could probably do with someone from this project taking a look at it. Thanks in advance for your help. Onel5969 TT me 12:25, 18 November 2018 (UTC)

Thank you User:onel5969 for your time and suggestions, and for submitting the draft for me. I included your suggestions in the newest version and also added a new figure which I meant to add earlier. --Sam orbital (talk) 23:19, 18 November 2018 (UTC)Sam orbital

ZNF385D infobox

I just created ZNF385D and for some reason the infobox isn't working... is it just a lag fetching stuff from Wikidata? Jytdog (talk) 03:14, 21 November 2018 (UTC)

A mapping between ZNF385D and the Wikidata item d:Q18046323 needed to be added. It is unclear to me if there is a bot that is supposed to do this automatically. Boghog (talk) 05:42, 21 November 2018 (UTC)
oh! thanks! Jytdog (talk) 17:24, 21 November 2018 (UTC)

Facto Post – Issue 18 – 30 November 2018

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

WikiCite issue

GLAM ♥ data — what is a gallery, library, archive or museum without a catalogue? It follows that Wikidata must love librarians. Bibliography supports students and researchers in any topic, but open and machine-readable bibliographic data even more so, outside the silo. Cue the WikiCite initiative, which was meeting in conference this week, in the Bay Area of California.

Wikidata training for librarians at WikiCite 2018

In fact there is a broad scope: "Open Knowledge Maps via SPARQL" and the "Sum of All Welsh Literature", identification of research outputs, Library.Link Network and Bibframe 2.0, OSCAR and LUCINDA (who they?), OCLC and Scholia, all these co-exist on the agenda. Certainly more library science is coming Wikidata's way. That poses the question about the other direction: is more Wikimedia technology advancing on libraries? Good point.

Wikimedians generally are not aware of the tech background that can be assumed, unless they are close to current training for librarians. A baseline definition is useful here: "bash, git and OpenRefine". Compare and contrast with pywikibot, GitHub and mix'n'match. Translation: scripting for automation, version control, data set matching and wrangling in the large, are on the agenda also for contemporary library work. Certainly there is some possible common ground here. Time to understand rather more about the motivations that operate in the library sector.

Links

Account creation is now open on the ScienceSource wiki, where you can see SPARQL visualisations of text mining.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 11:20, 30 November 2018 (UTC)

Facto Post – Issue 19 – 27 December 2018

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Learning from Zotero

Zotero is free software for reference management by the Center for History and New Media: see Wikipedia:Citing sources with Zotero. It is also an active user community, and has broad-based language support.

Zotero logo

Besides the handiness of Zotero's warehousing of personal citation collections, the Zotero translator underlies the citoid service, at work behind the VisualEditor. Metadata from Wikidata can be imported into Zotero; and in the other direction the zotkat tool from the University of Mannheim allows Zotero bibliographies to be exported to Wikidata, by item creation. With an extra feature to add statements, that route could lead to much development of the focus list (P5008) tagging on Wikidata, by WikiProjects.

Zotero demo video

There is also a large-scale encyclopedic dimension here. The construction of Zotero translators is one facet of Web scraping that has a strong community and open source basis. In that it resembles the less formal mix'n'match import community, and growing networks around other approaches that can integrate datasets into Wikidata, such as the use of OpenRefine.

Looking ahead, the thirtieth birthday of the World Wide Web falls in 2019, and yet the ambition to make webpages routinely readable by machines can still seem an ever-retreating mirage. Wikidata should not only be helping Wikimedia integrate its projects, an ongoing process represented by Structured Data on Commons and lexemes. It should also be acting as a catalyst to bring scraping in from the cold, with institutional strengths as well as resourceful code.

Links

Diversitech, the latest ContentMine grant application to the Wikimedia Foundation, is in its community review stage until January 2.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 19:08, 27 December 2018 (UTC)

Hello friends, another headache biology article I'm hoping you all can help with. Is endoexocytosis different than endocytosis/exocytosis, or has someone gone and conflated the two? I can't make heads or tails of it. ♠PMC(talk) 13:53, 5 January 2019 (UTC)

It's possible that it's another term for diacytosis. Natureium (talk) 20:07, 5 January 2019 (UTC)
Not a conflation, but it seems like this is terminology with very limited uptake (groan... pun somewhat intended ;) Looking at the main source for the article (Nickel et al 2008), it seems to be a self-conscious coinage by the authors of that paper. From the discussion section: "This internalization is thus, an 'endoexocytic' process in which a decision must be made as to which cell will incorporate the internalized pentilaminar membranes." I can't find any other significant usage of the term; other search hits are mostly things like "endo/exocytosis", and the articles citing Nickel mostly don't seem to pick it up either. (The only other usages I could find also have Murray as the senior author.) Also, the other citations in our article are just a subset of the papers cited in Nickel. So our article is pretty much an abstract of this one decade-old primary source. I'd prod it, with no prejudice toward re-creation if the term does eventually catch on. (No shame in that... when I came back to WP a few years ago, I prodded one of my own decade-old stubs created to explain a term that had gone obsolete in the meantime...) Opabinia regalis (talk) 21:57, 5 January 2019 (UTC)
Sorry for not responding to this earlier, it kind of slipped my mind. I'm going to PROD it now. Thanks everyone :) ♠PMC(talk) 21:47, 10 January 2019 (UTC)

We suck

Somehow this survived over a year and 800,000 pageviews on Cellular respiration. I've no idea how I didn't catch it at the time, but I guess the take-home message is that we need more people watching articles. Adrian J. Hunter(talkcontribs) 08:03, 16 January 2019 (UTC)

@Adrian J. Hunter: Yeesh, I hate finding ones that have been long-standing like that. A lot of these sorts of issues would be improved with growing the editor base for watching and spot-checking articles, but that's been an ongoing priority for years! Even rating all the backlog of unrated pages last year led me to some errors that have been around for over a decade. T.Shafee(Evo&Evo)talk 01:31, 21 January 2019 (UTC)

MINAS

Could someone have a look at MINAS to see if it's notable? Thanks in advance. – Uanfala (talk) 00:48, 21 January 2019 (UTC)

@Uanfala: It's probably less notable than MMDB and no more notable than metalPDB (which doesn't have a page). However I'm probably an inclusionist when it comes to these databases so I think that it is sufficiently notable to stay in as a stub. I'd probably draw the line at FREP though, which has only been cited 11 times since 2004, and has been offline for a few years now. T.Shafee(Evo&Evo)talk 01:42, 21 January 2019 (UTC)

Could someone who knows the topic take a look at these two diffs, [2] and [3], and tell me if they seem promotional/suspicious? I don't know enough to know if what they're writing is legitimate science or promotion. The reason I'm suspicious is because the account seems to edit very intermittently, making occasional huge additions to random, totally disparate topics, and it just feels off to me. ♠PMC(talk) 04:53, 21 January 2019 (UTC)

@Premeditated Chaos: The edits to Dielectrophoresis seem reasonable to me, since dielectrophoretic cell sorting is a notable application. I'm not keen on referring to it by a particular brand of machine, but then we do the equivalent in the Illumina dye sequencing article. The Single-cell analysis edits are definitely unbalanced. Dielectrophoretic cell sorting is pretty niche, so definitely the lead now over-emphasises it, but the rest of the edits are more reasonable. T.Shafee(Evo&Evo)talk 06:51, 21 January 2019 (UTC)

Facto Post – Issue 20 – 31 January 2019

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Everything flows (and certainly data does)

Recently Jimmy Wales has made the point that computer home assistants take much of their data from Wikipedia, one way or another. So as well as getting Spotify to play Frosty the Snowman for you, they may be able to answer the question "is the Pope Catholic?" Possibly by asking for disambiguation (Coptic?).

Amazon Echo device using the Amazon Alexa service in voice search showdown with the Google rival on an Android phone

Headlines about data breaches are now familiar, but the unannounced circulation of information raises other issues. One of those is Gresham's law stated as "bad data drives out good". Wikipedia and now Wikidata have been criticised on related grounds: what if their content, unattributed, is taken to have a higher standing than Wikimedians themselves would grant it? See Wikiquote on a misattribution to Bismarck for the usual quip about "law and sausages", and why one shouldn't watch them in the making.

Wikipedia has now turned 18, so should act like an adult, as well as being treated like one. The Web itself turns 30 some time between March and November this year, per Tim Berners-Lee. If the Knowledge Graph by Google exemplifies Heraclitean Web technology gaining authority, contra GIGO, Wikimedians still have a role in its critique. But not just with the teenage skill of detecting phoniness.

There is more to beating Gresham than exposing the factoid and urban myth, where WP:V does do a great job. Placeholders must be detected, and working with Wikidata is a good way to understand how having one statement as data can blind us to replacing it by a more accurate one. An example that is important to open access is that, firstly, the term itself needs considerable unpacking, because just being able to read material online is a poor relation of "open"; and secondly, trying to get Creative Commons license information into Wikidata shows up issues with classes of license (such as CC-BY) standing for the actual license in major repositories. Detailed investigation shows that "everything flows" exacerbates the issue. But Wikidata can solve it.

Links

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 10:53, 31 January 2019 (UTC)

Facto Post – Issue 21 – 28 February 2019

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

What is a systematic review?

Systematic reviews are basic building blocks of evidence-based medicine, surveys of existing literature devoted typically to a definite question that aim to bring out scientific conclusions. They are principled in a way Wikipedians can appreciate, taking a critical view of their sources.

PRISMA flow diagram for a systematic review

Ben Goldacre in 2014 wrote (link below) "[...] : the "information architecture" of evidence based medicine (if you can tolerate such a phrase) is a chaotic, ad hoc, poorly connected ecosystem of legacy projects. In some respects the whole show is still run on paper, like it's the 19th century." Is there a Wikidatan in the house? Wouldn't some machine-readable content that is structured data help?

2011 photograph by Bernard Schittny of the "Legacy Projects" group

Most likely it would, but the arcana of systematic reviews and how they add value would still need formal handling. The PRISMA standard dates from 2009, with an update started in 2018. The concerns there include the corpus of papers used: how selected and filtered? Now that Wikidata has a 20.9 million item bibliography, one can at least pose questions. Each systematic review is a tagging opportunity for a bibliography. Could that tagging be reproduced by a query, in principle? Can it even be second-guessed by a query (i.e. simulated by a protocol which translates into SPARQL)? Homing in on the arcana, do the inclusion and filtering criteria translate into metadata? At some level they must, but are these metadata explicitly expressed in the articles themselves? The answer to that is surely "no" at this point, but can TDM find them? Again "no", right now. Automatic identification doesn't just happen.

Actually these questions lack originality. It should be noted though that WP:MEDRS, the reliable sources guideline used here for health information, hinges on the assumption that the usefully systematic reviews of biomedical literature can be recognised. Its nutshell summary, normally the part of a guideline with the highest density of common sense, allows literature reviews in general validity, but WP:MEDASSESS qualifies that indication heavily. Process wonkery about systematic reviews definitely has merit.

Links

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 10:02, 28 February 2019 (UTC)

GAI (Arabidopsis thaliana gene)

GAI (Arabidopsis thaliana gene) is tagged with this project so I thought I'd ask here. The article has two infoboxes, one {{Infobox nonhuman protein}} and another is a manually created infobox inside the article for "Gene ontology". Is there an already existing infobox that does what the "Gene ontology" one does? --Gonnym (talk) 08:48, 20 March 2019 (UTC)

Our favorite yeast? Probably not. But no pretense of compliance with WP:Before. 7&6=thirteen () 18:50, 24 March 2019 (UTC)

Facto Post – Issue 22 – 28 March 2019

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

When in the cloud, do as the APIs do

Half a century ago, it was the era of the mainframe computer, with its air-conditioned room, twitching tape-drives, and appearance in the title of a spy novel Billion-Dollar Brain then made into a Hollywood film. Now we have the cloud, with server farms and the client–server model as quotidian: this text is being typed on a Chromebook.

Logo of Cloud API on Google Cloud Platform

The term Applications Programming Interface or API is 50 years old, and refers to a type of software library as well as the interface to its use. While a compiler is what you need to get high-level code executed by a mainframe, an API out in the cloud somewhere offers a chance to perform operations on a remote server. For example, the multifarious bots active on Wikipedia have owners who exploit the MediaWiki API.

APIs (called RESTful) that allow for the GET HTTP request are fundamental for what could colloquially be called "moving data around the Web"; from which Wikidata benefits 24/7. So the fact that the Wikidata SPARQL endpoint at query.wikidata.org has a RESTful API means that, in lay terms, Wikidata content can be GOT from it. The programming involved, besides the SPARQL language, could be in Python, younger by a few months than the Web.
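As a minimal sketch of what "GOT from it" can look like in Python, the snippet below builds the GET request for a SPARQL query and, optionally, sends it. The function names, the sample query and the User-Agent string are illustrative assumptions, not anything prescribed by Wikidata; only the endpoint address comes from the text above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# SPARQL endpoint named in the text above.
ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: three items that are instances (P31) of house cat (Q146).
EXAMPLE_QUERY = "SELECT ?item WHERE { ?item wdt:P31 wd:Q146 } LIMIT 3"


def build_query_url(sparql: str) -> str:
    """Encode a SPARQL query into the URL of a GET request (hypothetical helper)."""
    return ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


def run_query(sparql: str, user_agent: str = "FactoPostExample/0.1 (illustrative)") -> dict:
    """Send the GET request and parse the JSON reply. Needs network access."""
    request = Request(
        build_query_url(sparql),
        headers={"User-Agent": user_agent},  # assumed, descriptive client name
        method="GET",
    )
    with urlopen(request, timeout=60) as response:
        return json.load(response)


if __name__ == "__main__":
    # Building the URL needs no network; calling run_query() does.
    print(build_query_url(EXAMPLE_QUERY))
```

The whole query travels in the URL itself, which is why a plain GET is enough to move the data.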

Magic words, such as occur in fantasy stories, are wishful (rather than RESTful) solutions to gaining access. You may need to be a linguist to enter Ali Baba's cave or the western door of Moria (the respective languages being French, in fact, in the case of "Open Sesame", and Sindarin). Talking to an API requires a bigger toolkit, which first means you have to recognise the tools in terms of what they can do. On the way to the impactful or polymathic modern handling of facts, one must perhaps take only tactful notice of tech's endemic problem with documentation, and absorb the insightful point that the code in APIs does articulate the customary procedures now in place on the cloud for getting information. As Owl explained to Winnie-the-Pooh, it tells you The Thing to Do.


MediaWiki message delivery (talk) 11:46, 28 March 2019 (UTC)

Facto Post – Issue 24 – 17 May 2019

Text mining display of noun phrases from the US Presidential Election 2012

The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.
Semantic Web and TDM – a ContentMine view

Two dozen issues, and this may be the last, a valediction at least for a while.

It's time for a two-year summation of ContentMine projects involving TDM (text and data mining).

Wikidata and now Structured Data on Commons represent the overlap of Wikimedia with the Semantic Web. This common ground is helping to convert an engineering concept into a movement. TDM generally has little enough connection with the Semantic Web, being instead in the orbit of machine learning which is no respecter of the semantic. Don't break a taboo by asking bots "and what do you mean by that?"

The ScienceSource project innovates in TDM, by storing its text mining results in a Wikibase site. It strives for compliance of its fact mining, on drug treatments of diseases, with an automated form of the relevant Wikipedia referencing guideline MEDRS. Where WikiFactMine set up an API for reuse of its results, ScienceSource has a SPARQL query service, with look-and-feel exactly that of Wikidata's at query.wikidata.org. It also now has a custom front end, and its content can be federated, in other words used in data mashups: it is one of over 50 sites that can federate with Wikidata.

The human factor comes to bear through the front end, which combines a link to the HTML version of a paper, text mining results organised in drug and disease columns, and a SPARQL display of nearby drug and disease terms. Much software to develop and explain, so little time! Rather than telling the tale, Facto Post brings you ScienceSource links, starting from the how-to video, lower right.

ScienceSourceReview, introductory video: but you need to run it from the original upload file on Commons
Links for participation

The review tool requires a log in on sciencesource.wmflabs.org, and an OAuth permission (bottom of a review page) to operate. It can be used in simple and more advanced workflows. Examples of queries for the latter are at d:Wikidata_talk:ScienceSource project/Queries#SS_disease_list and d:Wikidata_talk:ScienceSource_project/Queries#NDF-RT issue.

Please be aware that this is a research project in development, and may have outages for planned maintenance. That will apply for the next few days, at least. The ScienceSource wiki main page carries information on practical matters. Email is not enabled on the wiki: use site mail here to Charles Matthews in case of difficulty, or if you need support. Further explanatory videos will be put into commons:Category:ContentMine videos.



MediaWiki message delivery (talk) 18:52, 17 May 2019 (UTC)

Facto Post – Issue 23 – 30 April 2019


The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Completely clouded?
Cloud computing logo

Talk of cloud computing draws a veil over hardware, but also, less obviously but more importantly, obscures such intellectual distinction as matters most in its use. Wikidata begins to allow tasks to be undertaken that were out of easy reach. The facility should not be taken as the real point.

Coming in from another angle, the "executive decision" is more glamorous; but the "administrative decision" should be admired for its command of facts. Think of the attitudes ad fontes, so prevalent here on Wikipedia as "can you give me a source for that?", and being prepared to deal with complicated analyses into specified subcases. Impatience expressed as a disdain for such pedantry is quite understandable, but neither dirty data nor false dichotomies are at all good to have around.

Issue 13 and Issue 21, respectively on WP:MEDRS and systematic reviews, talk about biomedical literature and computing tasks that would be of higher quality if they could be made more "administrative". For example, it is desirable that the decisions involved be consistent, explicable, and reproducible by non-experts from specified inputs.

What gets clouded out is not impossibly hard to understand. You do need to put together the insights of functional programming, which is a doctrinaire and purist but clearcut approach, with the practicality of office software. Loopless computation can be conceived of as a seamless forward march of spreadsheet columns, each determined by the content of previous ones. Very well: to do a backward audit, when now we are talking about Wikidata, we rely on integrity of data and its scrupulous sourcing: and clearcut case analyses. The MEDRS example forces attention on purge attempts such as Beall's list.
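The "forward march of spreadsheet columns" can be pictured with a small sketch in Python, written without explicit loops. The column names and the sample figures are invented for illustration; the only point is that each column is a pure function of the ones before it, so a backward audit amounts to recomputing from the inputs and comparing.

```python
from itertools import accumulate
from operator import add


def derive_columns(counts):
    """Derive each column from earlier ones only, with no explicit loops."""
    column_a = tuple(counts)                        # raw input column
    column_b = tuple(accumulate(column_a, add))     # running total of column A
    total = column_b[-1] if column_b else 0
    # Cumulative share of the total, computed from column B alone.
    column_c = tuple(map(lambda running: running / total if total else 0.0, column_b))
    return column_a, column_b, column_c


def audit(columns):
    """Backward audit: recompute everything from column A and compare."""
    return derive_columns(columns[0]) == tuple(columns)


if __name__ == "__main__":
    sheet = derive_columns([1, 2, 1])
    print(sheet)
    print(audit(sheet))
```

Because nothing is updated in place, the audit can only fail if a column was altered by hand or the inputs were changed, which is where integrity of data and sourcing come in.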


MediaWiki message delivery (talk) 11:27, 30 April 2019 (UTC)

Facto Post – Issue 2 – 13 July 2017


Editorial
Core models and topics

Wikimedians interest themselves in everything under the sun — and then some. Discussion on "core topics" may, oddly, be a fringe activity, and was popular here a decade ago.

The situation on Wikidata today does resemble the halcyon days of 2006 of the English Wikipedia. The growth is there, and the reliability and stylistic issues are not yet pressing in on the project. Its Berlin conference at the end of October will have five years of achievement to celebrate. Think Wikimania Frankfurt 2005.

Progress must be made, however, on referencing "core facts". This has two parts: replacing "imported from Wikipedia" in referencing by external authorities; and picking out statements, such as dates and family relationships, that must not only be reliable but be seen to be reliable.

In addition, there are many properties on Wikidata lacking a clear data model. An emerging consensus may push to the front key sourcing and biomedical properties as requiring urgent attention. Wikidata's "manual of style" is currently distributed over thousands of discussions. To make it coalesce, work on such a core is needed.

Links


Editor Charles Matthews. Please leave feedback for him.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Opted-out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

Most of the articles unintentionally citing retractions are related to this project. If someone could take a look at those, that would be great. Headbomb {t · c · p · b} 16:25, 21 February 2019 (UTC)

@Headbomb: Thanks for the notification! I've checked whether there are non-retracted sources to support the statements in a couple of these and will try to help with some more this weekend. Massive thanks to those editors who add {{retracted}} templates (in particular Rich Farmbrough) - vital work! T.Shafee(Evo&Evo)talk 23:20, 21 February 2019 (UTC)
You are welcome, last time I did this was many years ago. Perhaps I can revisit this. All the best: Rich Farmbrough, 09:30, 22 February 2019 (UTC).
Perhaps some day there will be a way to automate it via a bot. T.Shafee(Evo&Evo)talk 10:29, 22 February 2019 (UTC)
There is a User:RetractionBot in the making. It's waiting for access to the Retraction Watch database though. Headbomb {t · c · p · b} 22:59, 22 February 2019 (UTC)
Cool! I had no idea this was something we were even trying to track, but it's a really good idea. Opabinia regalis (talk) 06:00, 23 February 2019 (UTC)
@Headbomb, any updates on this bot? It sounds incredible. Prometheus720 (talk) 00:29, 12 April 2019 (UTC)

I've cleared the backlog of MCB articles with unknown importance--but the page won't update. Help?

Category_talk:Unknown-importance_MCB_articles

I went through every single page. Probably at least a third already had importance assessments. I'd like this to clear fully so that anything which enters it really needs to be assessed. Any tips on what might be happening? Prometheus720 (talk) 20:19, 11 April 2019 (UTC)

This seems like a good question for the techy people at WP:VPT. Natureium (talk) 20:26, 11 April 2019 (UTC)
It's because those pages are tagged with {{WikiProject Microbiology|mcb=yes}} without including |mcb_importance=whatever. MCB project tagging should probably just be removed from the micro template and MCB templates added wherever that parameter was previously used. Seppi333 (Insert ) 21:34, 11 April 2019 (UTC)
Curious. I never noticed that "feature" of the micro template. Just removed it from the handful of articles that had it. I'll remove it from the template as well unless anyone feels strongly for its inclusion. I assume it's just a holdover from the olden days. Ajpolino (talk) 22:29, 11 April 2019 (UTC)
@Ajpolino, I noticed that tag and thought it was weird. So are you saying you are sure it is gone for good from all articles in which it was present? I am for removing it from the template, personally. I think that most tags like that generally create clutter and overhead. Prometheus720 (talk) 00:32, 12 April 2019 (UTC)