Project

Due: Tuesday, March 19, 2019 at 11:55 p.m.
Points: 100


Please turn in your solution for this homework assignment on Canvas under Projects in Assignments.


Homework 4 had you produce a list of publication IDs from a keyword search on PubMed. The final project is to produce a list of the publication citations for that keyword.

Begin with your solution to the last homework (or the one on Canvas). From that, you will get a list of PubMed publication IDs; again, have the user enter the number of references and one or more keywords (if there is more than one, separate them with commas). Then, for each publication, use the following URL to get the metadata:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&retmode=xml&id=idlist

with no spaces and all on a single line, and idlist replaced with the ID list you got from the output of the last assignment. The web page you get back is an XML document giving details of the publications.
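
As a minimal sketch of the step above, the request URL might be assembled as follows. The names build_efetch_url and fetch_xml are hypothetical, and the sketch assumes that multiple IDs are joined with commas. The download is kept in a separate helper so the URL itself can be checked without network access.

```python
from urllib.request import urlopen

# Base URL from the assignment text; the ID list is appended to it.
EFETCH = ("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
          "?db=pubmed&retmode=xml&id=")

def build_efetch_url(ids):
    """Return the efetch URL for a list of PubMed IDs.

    Assumption: the IDs are joined with commas and no spaces.
    """
    return EFETCH + ",".join(str(i).strip() for i in ids)

def fetch_xml(ids):
    """Download the XML document for the given IDs and return it as bytes."""
    with urlopen(build_efetch_url(ids)) as response:
        return response.read()
```

Calling build_efetch_url with the IDs from your homework 4 solution gives a single-line URL that you can also paste into a browser to inspect the XML.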

Your job is to print a bibliography from this record. Your entry for each publication should look like this:


A. Bester, R. Zelazny, and H. Ellison, “On the Role of Viruses in Future Epidemics,” Journal of Irreproducible Results 3(4) pp. 29–35 (Mar. 2103). PUBMED: 23456789; DOI 12.1119/2847595.

Then print the abstract, if it is present in the record.

If there is no DOI, use the PII. If neither is there, omit that part of the entry.
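
One possible way to assemble an entry in the format shown above, including the DOI/PII fallback, is sketched below. It assumes the fields have already been extracted from the XML into a dictionary; the dictionary keys, the function names, and the "PII" label in the output are illustrative assumptions, not part of the assignment. The sketch also assumes every field other than the DOI and PII is present.

```python
def join_authors(names):
    """Join names as 'A, B, and C' (two names become 'A and B')."""
    if len(names) <= 1:
        return "".join(names)
    if len(names) == 2:
        return " and ".join(names)
    return ", ".join(names[:-1]) + ", and " + names[-1]

def format_entry(rec):
    """Build one bibliography line from a dict of already-extracted fields."""
    entry = (f"{join_authors(rec['authors'])}, “{rec['title']},” "
             f"{rec['journal']} {rec['volume']}({rec['issue']}) "
             f"pp. {rec['pages']} ({rec['date']}). PUBMED: {rec['pmid']}")
    if rec.get("doi"):            # prefer the DOI when it exists
        entry += f"; DOI {rec['doi']}"
    elif rec.get("pii"):          # otherwise fall back to the PII
        entry += f"; PII {rec['pii']}"
    return entry + "."            # neither present: omit that part
```

Keeping the formatting separate from the XML parsing makes it easy to test the output format by hand before wiring it to real records.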

You will need to look at the XML records to get the fields. These are delimited by tags with attributes, each of which may have a value. For example, the element

<ELocationID EIdType="doi" ValidYN="Y">10.1016/j.vaccine.2015.04.071</ELocationID>
has a tag of ELocationID, attributes of EIdType (with value doi) and ValidYN (with a value of Y), and the field contains 10.1016/j.vaccine.2015.04.071, which (as the EIdType value indicates) is a DOI.
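
To illustrate, here is a small sketch that parses that one element with the standard xml.etree.ElementTree module and reads its tag, attributes, and contents. The variable names are arbitrary; a real record holds many such elements nested inside one another.

```python
import xml.etree.ElementTree as ET

# The example element from the text, as a string.
snippet = ('<ELocationID EIdType="doi" ValidYN="Y">'
           '10.1016/j.vaccine.2015.04.071</ELocationID>')
elem = ET.fromstring(snippet)

print(elem.tag)              # the tag name
print(elem.attrib)           # dictionary of attribute names and values
print(elem.get("EIdType"))   # value of a single attribute
print(elem.text)             # the text the element contains
```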

The easiest way to see what the records look like is to run your solution to homework #4, and ask for a single entry. You can then see its structure. The fields of interest will have these tags:

Those will be enough to build the reference, as described above.

You can find methods for processing XML in the Python Library Reference, section 19.7 at https://docs.python.org/3.7/library/xml.etree.elementtree.html


Matt Bishop
Department of Computer Science
University of California at Davis
Davis, CA 95616-8562 USA
Last modified: Version of February 28, 2019 at 11:57AM
Winter Quarter 2019