Google Blogoscoped

Forum

Yahoo API + PHP5, Unknown UTF-8

Philipp Lenssen [PersonRank 10]

Wednesday, March 2, 2005
15 years ago

I want to try the new Yahoo API, but I get the PHP5 error message

   Unknown encoding "utf-8"

in the marked line below...
(After that, applying the XPath fails as a symptom of the first bug.)

function showImages($q)
{
// Note I split up the http : // to avoid auto-linking here..

   $image_url = 'http : // api.search.yahoo.com/' .
   'ImageSearchService/V1/imageSearch?' .
   'appid=YahooDemo&query=' . urlencode($q) . '&results=5';

   $dom = new domdocument; // ERROR!

   $dom->load($image_url);
   $xpath = new domxpath($dom);

   $xNodes = $xpath->query('//Result');
   foreach ($xNodes as $xNode)
   {
   // ...
   }
}

Can anyone help? I tried some encoding workarounds but nothing worked... Note that direct file access with PHP5 (using file() or fopen()) is disabled on my server.

Philipp Lenssen [PersonRank 10]

15 years ago #

Actually, the error happens in the line below the comment I included...

Jason [PersonRank 0]

15 years ago #

Did you download the Yahoo SDK from the developer page? It has a sample php script, and it uses:
rawurlencode($_REQUEST['query'])

so maybe you are using the wrong encoding script. See if that fixes it. Also, you should check out the SDK and documentation.

www.shiwej.com

andrew [PersonRank 0]

15 years ago #

that example php script is crap – the person who wrote it even wrote in the comments:

// Ok, now that we have the results in an easy to use format,
// display them. It's quite ugly because I am using a single
// display loop to display every type and I don't really understand HTML

I don't really understand HTML, yet I was assigned to write an example PHP code that is distributed with the Yahoo API.

Anyways... have you tried just using SimpleXML? It's included by default in PHP5 and could make the process a whole heck of a lot easier.
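
A minimal sketch of the SimpleXML route suggested above, for PHP 7 or later. The sample XML is a hypothetical stand-in for the Yahoo response: the ResultSet/Result/Title/Url layout is an assumption, and only the urn:yahoo:srchmi namespace is taken from this thread. It parses a string, so it does not exercise the remote load that triggers the error.

```php
<?php
// Hypothetical sample payload; the real Yahoo response may be structured differently.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<ResultSet xmlns="urn:yahoo:srchmi">
  <Result><Title>first.jpg</Title><Url>http://example.com/first.jpg</Url></Result>
  <Result><Title>second.jpg</Title><Url>http://example.com/second.jpg</Url></Result>
</ResultSet>
XML;

// Collect the text of every <Title> found under a <Result> element.
function listImageTitles(string $xml): array
{
    $resultSet = simplexml_load_string($xml);
    if ($resultSet === false) {
        return []; // not well-formed, or the parser rejected it
    }

    $titles = [];
    foreach ($resultSet->Result as $result) {
        $titles[] = (string) $result->Title;
    }
    return $titles;
}

print_r(listImageTitles($xml));
```

Property access such as `$resultSet->Result` reaches elements in the document's default namespace without extra setup, which is part of why SimpleXML can feel lighter than DOM plus XPath for a flat result list.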

Rasmus Lerdorf [PersonRank 1]

15 years ago #

Uh, I wrote that example. It's mostly ugly because XML handling in PHP4 is ugly. I wrote a better example for PHP5 which you can see here:

toys.lerdorf.com/archives/33-B ...!-Search-Web-Services.html

As for your error, I am not really sure why you are getting that. Which version of PHP5? I just tried this on my build:

   $image_url = " api.search.yahoo.com/ImageSear ...
   $dom = new domdocument;
   $dom->load($image_url);
   echo $dom->saveXML();

And it worked nicely.

Philipp Lenssen [PersonRank 10]

15 years ago #

I was going for the new native PHP5 DOM/XML support. I saw the SDK script, but it was PHP4, and it indeed didn't look right to me. I wanted to try this with PHP5. I also checked the Python script from the SDK and it said "this is based on the PHP script"...

Thanks Andrew for the tip.

Rasmus Lerdorf [PersonRank 1]

15 years ago #

I had a closer look at your Unknown Encoding error. It appears to be a libxml2 error from the xmlCheckHTTPInput() function in xmlIO.c.

My best guess is that your libxml2 was not compiled with iconv support.
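
One rough way to test this guess from PHP is to ask DOMDocument to parse a tiny document in an encoding that libxml2 normally needs a conversion library for. This is only a sketch: libxmlCanDecode() is a hypothetical helper, the choice of windows-1251 as a "needs conversion support" encoding is an assumption, and parsing a string does not go through the HTTP code path named above, so a passing result does not rule out the bug.

```php
<?php
// Returns true if libxml2 (via DOMDocument) accepts a document declared
// in the given encoding. $bytes must already be encoded accordingly.
function libxmlCanDecode(string $encoding, string $bytes): bool
{
    $xml = '<?xml version="1.0" encoding="' . $encoding . '"?><r>' . $bytes . '</r>';

    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $ok = $dom->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

echo 'libxml2 version: ', LIBXML_DOTTED_VERSION, "\n";
echo 'UTF-8: ', var_export(libxmlCanDecode('UTF-8', "caf\xC3\xA9"), true), "\n";
echo 'windows-1251: ', var_export(libxmlCanDecode('windows-1251', "\xEF\xF0\xE8"), true), "\n";
```

If the second probe reports false while the first reports true, missing conversion support in the libxml2 build becomes a more plausible explanation.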

Richard M [PersonRank 1]

15 years ago #

So why are you using "header("Content-type: text/html; charset=utf-8");" and not just "header("Content-type: text/html");"?

Rasmus Lerdorf [PersonRank 1]

15 years ago #

Richard, since the XML is coming in as UTF-8, it is easiest to just stay in UTF-8 and pass it through to the browser directly. If you need to process the data in some way, you'll need to utf8_decode() or iconv() it to convert it to an encoding you can work with and then you can send it out in that new encoding if you like.

Note though that the problem Philipp is having has nothing to do with the output encoding. In his case the underlying XML library (libxml2) PHP uses to parse the XML was likely built without iconv support which means it isn't able to handle the UTF-8 encoded content it was asked to parse.
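
As a small illustration of the conversion step mentioned above, here is a sketch that uses iconv() to turn a UTF-8 string into ISO-8859-1. utf8ToLatin1() is a hypothetical helper name, and the sketch assumes PHP's iconv extension is loaded.

```php
<?php
// Convert a UTF-8 string to ISO-8859-1, failing loudly if a character
// has no equivalent in the target encoding.
function utf8ToLatin1(string $utf8): string
{
    $converted = iconv('UTF-8', 'ISO-8859-1', $utf8);
    if ($converted === false) {
        throw new RuntimeException('String cannot be represented in ISO-8859-1');
    }
    return $converted;
}

$title = "caf\xC3\xA9"; // "café" as UTF-8 bytes
$latin1 = utf8ToLatin1($title);

echo strlen($title), ' bytes as UTF-8, ', strlen($latin1), " bytes as ISO-8859-1\n";
```

If the page is then sent in the new encoding, the charset in the Content-type header has to be changed to match.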

Philipp Lenssen [PersonRank 10]

15 years ago #

I forwarded Rasmus' problem description to my host support but haven't heard back from them yet...

Philipp Lenssen [PersonRank 10]

15 years ago #

I fixed the problem with the Yahoo API output. I created a PHP4 file to pass on the XML, which I then access with PHP5. In the PHP4 file I delete the following attributes from the XML you (Yahoo) provide:

xmlns:xsi="http : / /www.w3.org/ 2001/XMLSchema-instance"
xmlns="urn:yahoo:srchmi" xsi:schemaLocation="urn:yahoo:srchmi
http : //api.search. yahoo.com/ImageSearchService/ V1/ImageSearchResponse.xsd"

Once these lines are deleted, everything works fine. And it's all in UTF-8, so UTF-8 was not the problem. Maybe Yahoo could make the Schema links optional in the API, or help PHP fix their bug (I suppose it's a bug).
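
One possible reason the XPath query starts matching once the attributes are gone, offered as an assumption rather than something confirmed in this thread: in XPath 1.0 an unprefixed name such as Result only matches elements in no namespace, so it finds nothing while xmlns="urn:yahoo:srchmi" is present. The sketch below shows that behaviour on a hypothetical sample document, along with the alternative of registering the namespace. The prefix y is arbitrary.

```php
<?php
// Hypothetical sample payload using the default namespace quoted above.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<ResultSet xmlns="urn:yahoo:srchmi">
  <Result><Title>first.jpg</Title></Result>
  <Result><Title>second.jpg</Title></Result>
</ResultSet>
XML;

// Count <Result> elements, with or without a registered namespace prefix.
function countResults(string $xml, bool $useNamespace): int
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $xpath = new DOMXPath($dom);

    if ($useNamespace) {
        // Bind a prefix of our choosing to the document's default namespace.
        $xpath->registerNamespace('y', 'urn:yahoo:srchmi');
        return $xpath->query('//y:Result')->length;
    }

    // Unprefixed names only match elements that are in no namespace.
    return $xpath->query('//Result')->length;
}

echo 'without namespace: ', countResults($xml, false), "\n";
echo 'with namespace: ', countResults($xml, true), "\n";
```

This leaves the XML untouched, but it does not address the "Unknown encoding" error raised while loading the remote URL.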

Sebastian [PersonRank 0]

15 years ago #

Hi,

May I add that I am having exactly the same problem here? I am also getting the error: Unknown encoding "utf-8".

To me it seems like there is a problem with the encoded query string. Or it is a bug in the DOM libraries???

If I use just one word (e.g. "films"), there is no problem at all. But as soon as my query is a rawurlencoded string with a space (e.g. "films new").... Booooom!!

Any ideas?

Thanks,
Sebastian

Philipp Lenssen [PersonRank 10]

15 years ago #

Sebastian, I asked Yahoo and I'm waiting for help. I've been told the XML library has a bug or is not the latest version, but it happens with the PHP default setup, so many people may have this problem.

As a workaround for now, you can use the "file" function to remove the Schema declarations on top of the XML file, and then load it via the domdocument. (I even had to write a PHP4 wrapper to pass on to PHP5 because only PHP4 on my server can grab external documents using "file"...)
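
A minimal sketch of this workaround, assuming the raw XML has already been fetched into a string (the fetch itself, and the PHP4 wrapper, are left out). stripSchemaDeclarations() and resultTitles() are hypothetical helper names, the sample payload is made up, and totalResultsAvailable is an assumed attribute name included only to show that other attributes on the root element survive.

```php
<?php
// Remove the xmlns, xmlns:* and xsi:schemaLocation attributes and leave
// everything else in the document as it is.
function stripSchemaDeclarations(string $xml): string
{
    $pattern = '/\s+(?:xmlns(?::\w+)?|xsi:schemaLocation)="[^"]*"/';
    return preg_replace($pattern, '', $xml) ?? $xml;
}

// Load the cleaned XML and return the text of every Result/Title element.
function resultTitles(string $rawXml): array
{
    $dom = new DOMDocument();
    $dom->loadXML(stripSchemaDeclarations($rawXml));
    $xpath = new DOMXPath($dom);

    $titles = [];
    foreach ($xpath->query('//Result/Title') as $node) {
        $titles[] = $node->textContent;
    }
    return $titles;
}

// Hypothetical payload carrying the declarations quoted earlier in the thread.
$raw = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
    . '<ResultSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
    . ' xmlns="urn:yahoo:srchmi"'
    . ' xsi:schemaLocation="urn:yahoo:srchmi'
    . ' http://api.search.yahoo.com/ImageSearchService/V1/ImageSearchResponse.xsd"'
    . ' totalResultsAvailable="2">'
    . '<Result><Title>first.jpg</Title></Result>'
    . '<Result><Title>second.jpg</Title></Result>'
    . '</ResultSet>';

print_r(resultTitles($raw));
```

Removing only the attributes, rather than whole lines at the top of the file, keeps the root element and whatever else it carries.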

Philipp Lenssen [PersonRank 10]

15 years ago #

Funny. I wrote to my provider Schlund about the problem a while ago, and they could only refer me to this help – my own blog post! blogoscoped.com/archive/2005-0 ...

Sebastian [PersonRank 0]

15 years ago #

Philipp, thank you for your hint. In the meantime I did some more testing. I generated a request and saved the resulting XML file provided by the Yahoo API to my local server. If I replace the URL with the local file, no error is reported. To me it looks like the DOM command for opening an XML file translates the URL-encoded link back to plain text and thus cancels the request because of the resulting space in it…?

I hope they’ll fix this bug soon as I am not willing to throw away the reported total count of search results.

Michael U [PersonRank 0]

15 years ago #

Is it possible to use eregi_replace to remove the whitespace or actual space and replace with a + sign on multiple phrase searches?

If I do not use a plus sign in place of an actual space when typing in the search field, the urlencoded string is read incorrectly by PHP and the script breaks. We know it is PHP because if we pass the URL with a space to the Y! API, it works with no problem.

My PHP version: 5.0.2

Using the DOM example from the Yahoo SDK
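
On the question of replacing spaces with a plus sign: urlencode() already does that, while rawurlencode() produces %20, so a regular expression should not be needed (eregi_replace was also removed in PHP 7). The sketch below shows the difference. buildImageSearchUrl() is a hypothetical helper built around the URL from the first post, and whether the + form avoids the error on an affected install is not confirmed in this thread.

```php
<?php
$query = 'films new';

echo urlencode($query), "\n";    // spaces become "+"
echo rawurlencode($query), "\n"; // spaces become "%20"

// Build the image search URL with form-style encoding (space => "+").
function buildImageSearchUrl(string $query): string
{
    $params = [
        'appid'   => 'YahooDemo',
        'query'   => $query,
        'results' => 5,
    ];

    // Pass '&' explicitly so the output does not depend on php.ini settings.
    return 'http://api.search.yahoo.com/ImageSearchService/V1/imageSearch?'
        . http_build_query($params, '', '&');
}

echo buildImageSearchUrl($query), "\n";
```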
