performance - PHP: Improve speed returning MeSH terms from Entrez database (PubMed)
I want to extract the MeSH terms for search results from the PubMed database, using PHP.
I wrote a script that works, but it is slow. It opens each article, parses the XML, and retrieves the MeSH terms. The "fopen" call is the slow part.
    // $base, $db and $id are set earlier in the script
    $url = $base . "efetch.fcgi?db=$db&id=$id&rettype=abstract";
    $opts = array(
        'http' => array(
            'method' => "GET",  // HTTP method names are case-sensitive
            'header' => "User-Agent: myagent/1.0\r\n"
        )
    );
    $context = stream_context_create($opts);
    $fp = fopen($url, 'r', false, $context);
    $output = stream_get_contents($fp);
    fclose($fp);
For each article, the script opens a large XML file, for example: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=19616537&retmode=xml
Is there a way to retrieve only the MeSH terms, or at least a smaller part of the XML? Or to load only half of the file?
Thank you.
Update: I got an improvement by using efetch with retmode=text, rettype=medline. This reduced the download for one file from 15 KB to 4 KB. I also bundled the downloads to reduce the number of requests.
It now takes 4.8 s to load 500 results.
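The bundling described in the update could look roughly like the sketch below. It is not the exact script from the question: the function names (buildEfetchUrl, parseMeshTerms, fetchMeshTerms) and the batch size of 200 are assumptions made for illustration, and the parser assumes the usual MEDLINE text layout, where each field starts with a tag such as "PMID- " or "MH  - " and wrapped values continue on lines indented by six spaces.

```php
<?php
// Build one efetch URL for a whole batch of PMIDs (comma-separated ids),
// asking for the compact MEDLINE text format instead of XML.
function buildEfetchUrl(array $pmids, string $base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/'): string
{
    return $base . 'efetch.fcgi?' . http_build_query([
        'db'      => 'pubmed',
        'id'      => implode(',', $pmids),
        'rettype' => 'medline',
        'retmode' => 'text',
    ]);
}

// Collect the MH (MeSH heading) lines of every record, keyed by PMID.
function parseMeshTerms(string $medline): array
{
    $terms = [];
    $pmid = null;
    $lastTag = null;
    foreach (preg_split('/\r\n|\r|\n/', $medline) as $line) {
        if (preg_match('/^([A-Z]{2,4})\s*- (.*)$/', $line, $m)) {
            $lastTag = $m[1];
            $value = trim($m[2]);
            if ($lastTag === 'PMID') {
                $pmid = $value;            // a new record starts here
                $terms[$pmid] = [];
            } elseif ($lastTag === 'MH' && $pmid !== null) {
                $terms[$pmid][] = $value;
            }
        } elseif ($lastTag === 'MH' && $pmid !== null && preg_match('/^\s{6}(\S.*)$/', $line, $m)) {
            // Continuation of a wrapped MH value: append to the last term.
            $terms[$pmid][count($terms[$pmid]) - 1] .= ' ' . trim($m[1]);
        } elseif (trim($line) === '') {
            $lastTag = null;               // blank line separates records
        }
    }
    return $terms;
}

// Fetch MeSH terms for many PMIDs with one request per batch.
// The batch size of 200 is an assumption, not a documented requirement.
function fetchMeshTerms(array $pmids, int $batchSize = 200): array
{
    $context = stream_context_create([
        'http' => ['method' => 'GET', 'header' => "User-Agent: myagent/1.0\r\n"],
    ]);
    $all = [];
    foreach (array_chunk($pmids, $batchSize) as $batch) {
        $body = file_get_contents(buildEfetchUrl($batch), false, $context);
        if ($body === false) {
            continue;                      // skip a failed batch instead of aborting
        }
        $all += parseMeshTerms($body);     // PMIDs are unique keys, so union is safe
    }
    return $all;
}
```

Calling fetchMeshTerms with the list of PMIDs from a search would then return an array mapping each PMID to its list of MeSH headings. Only the MH lines are kept, so the rest of each record is downloaded but never parsed further.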
I would still like it to be faster.
Does anyone have tips?
I'm not sure what your limits and goal are. If you are querying the whole database, try it the other way around: query the database for the list of articles for every known MeSH term. As far as I know there are "27,149 descriptors in 2014 MeSH", so you need to send fewer than 30 thousand queries to get results for the whole database. You can achieve this using the Europe PMC RESTful web service.
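A rough sketch of that inverted lookup follows, under several assumptions that should be checked against the Europe PMC documentation: the search endpoint and its query, resultType, format, pageSize and cursorMark parameters are written from memory, the JSON field names (resultList, result, pmid, nextCursorMark) are assumed, and the query template MESH:"..." is a hypothetical placeholder for whatever field the service uses for MeSH descriptors. The function names are made up for this example.

```php
<?php
// Build a Europe PMC search URL that returns only ids, one page at a time.
// Endpoint and parameter names are assumptions; verify them before use.
function buildEuropePmcUrl(string $query, string $cursorMark = '*', int $pageSize = 1000): string
{
    return 'https://www.ebi.ac.uk/europepmc/webservices/rest/search?' . http_build_query([
        'query'      => $query,
        'resultType' => 'idlist',
        'format'     => 'json',
        'pageSize'   => $pageSize,
        'cursorMark' => $cursorMark,
    ]);
}

// Pull the PMIDs and the next cursor out of one JSON page.
// The field names used here are assumed, not confirmed by the original post.
function extractPage(string $json): array
{
    $data = json_decode($json, true);
    $ids = [];
    foreach ($data['resultList']['result'] ?? [] as $row) {
        if (isset($row['pmid'])) {
            $ids[] = (string) $row['pmid'];
        }
    }
    return ['ids' => $ids, 'next' => $data['nextCursorMark'] ?? null];
}

// Page through all articles for one MeSH term.
// $queryTemplate is a HYPOTHETICAL query syntax; replace it with the real one.
function articlesForMeshTerm(string $term, string $queryTemplate = 'MESH:"%s"'): array
{
    $query = sprintf($queryTemplate, $term);
    $cursor = '*';
    $ids = [];
    do {
        $json = file_get_contents(buildEuropePmcUrl($query, $cursor));
        if ($json === false) {
            break;                         // network error: return what we have
        }
        $page = extractPage($json);
        $ids = array_merge($ids, $page['ids']);
        // Stop on an empty page or when the cursor no longer advances.
        $done = $page['ids'] === [] || $page['next'] === null || $page['next'] === $cursor;
        $cursor = $page['next'];
    } while (!$done);
    return $ids;
}
```

Looping articlesForMeshTerm over the descriptor list gives a term-to-articles map, which can be inverted locally to get the MeSH terms per article without fetching any article record.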