Wikipedia API - grab an infobox?

Does MediaWiki provide a way to return the information in the infobox (the table usually shown on the right side of the article page)? For example, I would like to grab the Origin field from the Radiohead article:

http://en.wikipedia.org/wiki/Radiohead

Or do I need to parse the HTML page?

+3
3 answers

You can use the revisions property (prop=revisions) together with the rvgeneratexml parameter to generate a parse tree for the article. Then you can apply XPath to it, or traverse it, to find the information you need.

Here is some sample code:

$page = 'Radiohead';
$api_call_url = 'http://en.wikipedia.org/w/api.php?action=query&titles=' .
    urlencode( $page ) . '&prop=revisions&rvprop=content&rvgeneratexml=1&format=json';

// You must identify yourself to the API; see Meta-Wiki for more details.

$user_agent = 'Your name <your email>';

$curl = curl_init();
curl_setopt_array( $curl, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_USERAGENT => $user_agent,
    CURLOPT_URL => $api_call_url,
) );
$response = json_decode( curl_exec( $curl ), true );
curl_close( $curl );

foreach( $response['query']['pages'] as $page ) {
    $parsetree = simplexml_load_string( $page['revisions'][0]['parsetree'] );

    // Use XPath to find the Origin parameter of the Infobox musical artist template.

    $infobox_origin = $parsetree->xpath( '//template[contains(string(title),' .
        '"Infobox musical artist")]/part[contains(string(name),"Origin")]/value' );

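    // Guard against pages where the template or parameter is missing.
    if ( !empty( $infobox_origin ) ) {
        echo trim( strval( $infobox_origin[0] ) );
    }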
}
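
To make the XPath step easier to try without calling the API, here is a minimal offline sketch. The sample XML is a hand-written, simplified stand-in for the parse tree, and get_template_param() is a hypothetical helper name, not part of MediaWiki or PHP. It matches the parameter name exactly after whitespace normalization instead of using contains(), and it assumes the template and parameter names contain no double quotes.

```php
<?php
// Hand-written, simplified stand-in for the API's parse tree (an assumption,
// not real API output).
$sample_parsetree = <<<XML
<root><template><title>Infobox musical artist
</title><part><name> Name </name>=<value> Radiohead
</value></part><part><name> Origin </name>=<value> Abingdon, Oxfordshire, England
</value></part></template></root>
XML;

// Hypothetical helper: returns the trimmed value of one template parameter,
// or null if the XML is invalid or nothing matches.
function get_template_param( string $parsetree_xml, string $template, string $param ): ?string {
    $tree = simplexml_load_string( $parsetree_xml );
    if ( $tree === false ) {
        return null; // input was not well-formed XML
    }
    // normalize-space() strips the padding around the parameter name,
    // so "Origin" matches " Origin " but not a longer name containing it.
    $query = sprintf(
        '//template[contains(string(title), "%s")]/part[normalize-space(name) = "%s"]/value',
        $template,
        $param
    );
    $matches = $tree->xpath( $query );
    if ( empty( $matches ) ) {
        return null;
    }
    // strval() yields only the direct text of <value>; nested templates
    // inside the value would need extra handling.
    return trim( strval( $matches[0] ) );
}

echo get_template_param( $sample_parsetree, 'Infobox musical artist', 'Origin' ), "\n";
// prints: Abingdon, Oxfordshire, England
```

Returning null instead of indexing the result directly keeps the caller safe when an article uses a different infobox template or leaves the parameter out.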
+4

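MediaWiki has no built-in way to do this, since infoboxes are ordinary templates rather than structured data (unless the wiki uses something like Semantic MediaWiki, which Wikipedia does not). You would need to parse either the HTML or the wikitext, both of which can be fetched (rendered or raw) through the API.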
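
As a rough illustration of the wikitext route, the sketch below pulls a single-line parameter out of raw wikitext with a regular expression. The sample wikitext is hand-written and find_wikitext_param() is a hypothetical helper name. It is deliberately naive: it assumes the parameter sits on its own "| name = value" line and will not cope with multi-line values or nested templates.

```php
<?php
// Hand-written wikitext fragment used as sample input (an assumption).
$sample_wikitext = <<<'WIKI'
{{Infobox musical artist
| name   = Radiohead
| origin = [[Abingdon, Oxfordshire]], England
| genre  = Alternative rock
}}
WIKI;

// Hypothetical helper: returns the raw value of a single-line
// "| name = value" parameter, or null if it is not found.
function find_wikitext_param( string $wikitext, string $param ): ?string {
    // m: ^ and $ match at line boundaries; i: parameter name is case-insensitive.
    $pattern = '/^\s*\|\s*' . preg_quote( $param, '/' ) . '\s*=[ \t]*(.*)$/mi';
    if ( preg_match( $pattern, $wikitext, $m ) === 1 ) {
        return trim( $m[1] );
    }
    return null;
}

echo find_wikitext_param( $sample_wikitext, 'Origin' ), "\n";
// prints: [[Abingdon, Oxfordshire]], England
```

The returned value is still wikitext, so link markup such as [[...]] has to be stripped or resolved separately.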

+1

Have a look at DBpedia.

The table you are talking about is called an "Infobox", and DBpedia extracts the data in it from Wikipedia.

Related SO question: DBPedia Infobox

UPDATE

OK, here is a SPARQL query:

SELECT ?org
WHERE {
    <http://dbpedia.org/resource/Radiohead> dbpprop:origin ?org
}

The query can be run through a URL against the DBpedia SPARQL endpoint.

SPARQL result:

org
"Abingdon, Oxfordshire, England"@en
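
For completeness, here is a hedged sketch of how such a query could be turned into a request URL and how the JSON result could be read in PHP. build_sparql_url() and sparql_values() are hypothetical helper names, the response string is a hand-written sample in the standard SPARQL JSON results layout rather than real endpoint output, and the query spells out the full property IRI on the assumption that the dbpprop: prefix may not be predefined on every endpoint.

```php
<?php
// Hypothetical helper: builds a GET URL for a SPARQL endpoint,
// asking for results in the SPARQL JSON results format.
function build_sparql_url( string $endpoint, string $query ): string {
    return $endpoint . '?' . http_build_query( array(
        'query'  => $query,
        'format' => 'application/sparql-results+json',
    ) );
}

// Hypothetical helper: collects the values bound to one variable
// from a SPARQL JSON results document.
function sparql_values( string $json, string $var ): array {
    $data = json_decode( $json, true );
    $values = array();
    foreach ( $data['results']['bindings'] ?? array() as $row ) {
        if ( isset( $row[$var]['value'] ) ) {
            $values[] = $row[$var]['value'];
        }
    }
    return $values;
}

$query = 'SELECT ?org WHERE { <http://dbpedia.org/resource/Radiohead> ' .
    '<http://dbpedia.org/property/origin> ?org }';
$url = build_sparql_url( 'http://dbpedia.org/sparql', $query );

// Hand-written sample response (an assumption, not fetched from DBpedia).
$sample_response = '{"head":{"vars":["org"]},"results":{"bindings":[' .
    '{"org":{"type":"literal","xml:lang":"en","value":"Abingdon, Oxfordshire, England"}}]}}';

print_r( sparql_values( $sample_response, 'org' ) );
```

The URL in $url could then be fetched with the same cURL setup used for the MediaWiki API, and the body passed to sparql_values().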

+1
