Google Blogoscoped

Forum

Illustrated Google Wishlist: Google Music

Jeremy [PersonRank 0]

Wednesday, March 21, 2007
17 years ago, 4,982 views

That's it? That is as far as y'all can imagine Google Music to be?

What about query by tapping or beatboxing, to find similar rhythms? Could be useful for finding good dance music, for everything from club/discotheque dancing to your grandparents' waltz music.

What about query by timbre, where if you like the sound of a certain fuzzy guitar, or the sound of a particular bass drum hit, or the sound of someone's vibrato, you can find other pieces of music that have similar usages of that timbre?

What about query by harmonic progression, so that you can find songs of similar harmonic complexity? Could be useful for finding 12-bar blues and similar types of music with harmonic structure.

What about automatic cover detection? Suppose there were a system that looked at your collection, and saw that you liked reggae. And it also saw that you had a copy of the Beatles' "Yesterday". Well, then it might recommend to you, not just any reggae song, but a cover of "Yesterday" that was done by a reggae band/in a reggae style.

What about query by mood, in which all the songs might be from different genres, but all shared some sort of similar "energy" level. So if you are throwing a dinner party, the system would automatically find all the songs in your collection that matched a "conversational" mood, whereas when you are playing video games with your friends, songs of a different mood would be played. Automatically, through the magic of search.

There is so much great stuff in this area, and so much work that is being done, and I don't see many of the major search engines participating in any of it.

The query by humming stuff, though.. people need to move beyond that. It's terribly old hat.

Colin Colehour [PersonRank 10]

17 years ago #

Google Music should also have something similar to the Pandora service. This would allow me to find songs that are similar to my favorite tracks. This is better than just having a genre listing as it would offer better suggestions to my query search.

Eytan Buchman [PersonRank 10]

17 years ago #

I am with Jeremy here... that isn't innovative enough for Google. Seeing how there are already companies that do that, I would expect more from Google. I don't think they like getting users by using their popularity alone. Maybe they could have a music player online that could play copyrighted songs by supporting them with audio/text ads?

Philipp Lenssen [PersonRank 10]

17 years ago #

> The query by humming stuff, though.. people need to
> move beyond that. It's terribly old hat.

Search was old hat in 1998, and then Google came along and made it actually work. Not because of a complex interface but because of scope, relevancy & speed. And I have yet to see a working sing-to-find engine, Jeremy (Midomi is much too limited in scope & relevancy). So I don't expect something creative here at first, and not something "music power-user" oriented like query by harmonic progression, but simply a system that works on great scale, with millions of songs, returning the most relevant stuff at high speed. Of course there is lots of music analysis (possibly of the sort you mention) in the backend, to help casual users with casual queries. And that's all that's needed to have a great product (and of course that scaling is the hard part).

Jeremy [PersonRank 0]

17 years ago #

Philipp, the problem with query by humming is that it has been well-studied, and found not to work very well, even with a human baseline. Assuming that machines can only approach the effectiveness of humans, rather than surpass them, this does not bode very well for the QBH search paradigm. (Note: I explicitly mean effectiveness, not efficiency. If a human can only do a task with 70% or 50% accuracy, then there isn't much chance of a machine being more accurate.)

In particular, take a look at this 2003 workshop paper, explaining the problem:

www.music-ir.org/evaluation/wp3/wp3_pardo_query.pdf

We are talking here about a backend collection of only a few hundred songs, which songs are known very intimately by all three researchers. One researcher would hum a query, and the other two researchers would try to guess which of the only few hundred songs were meant. Accuracy is pretty low. So if a human cannot do it on a collection of only a few hundred songs, what happens when a machine tries to do it on a collection of a few hundred THOUSAND songs? I would like to go on the record by saying that the task won't be solved in the next 20 years.

Oh, and when talking about query by harmonic progression, I have said nothing about "music power user". The user interface could be something as simple as picking a song that you like. Then the system, underneath, would use harmonic progression as the similarity mechanism, rather than melodic note sequences (QBH).

In that way, instead of the user saying "I like classical, find me more classical songs", the user can say "I like this Rachmaninov piece. Find me something similar" and the system will find something with equal harmonic complexity, rather than just any old simple folk ditty. But the actual mechanisms used by the system can be hidden from the user, so that the user doesn't have to be a "power user" in order to use the harmonic matching similarity mechanism.
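To make the "hidden mechanism" idea concrete, here is a minimal Python sketch of one way a backend might compare two songs by harmonic progression. It is a toy under stated assumptions, not a description of any real system: chords are reduced to their root pitch classes (0-11), each progression is converted to root-to-root intervals so that transposition does not matter, and similarity is simply the fraction of matching intervals. The function names and example progressions are hypothetical.

```python
def root_intervals(roots):
    """Turn chord roots (pitch classes 0-11) into successive intervals mod 12."""
    return [(b - a) % 12 for a, b in zip(roots, roots[1:])]


def progression_similarity(roots_a, roots_b):
    """Fraction of positions where the interval sequences agree.

    Toy measure: only equal-length progressions are compared.
    """
    ia, ib = root_intervals(roots_a), root_intervals(roots_b)
    if len(ia) != len(ib) or not ia:
        return 0.0
    matches = sum(1 for x, y in zip(ia, ib) if x == y)
    return matches / len(ia)


# I-IV-V-I in C major and in G major: same progression, transposed.
c_major = [0, 5, 7, 0]
g_major = [7, 0, 2, 7]
# I-vi-IV-V in C major: a different progression.
doo_wop = [0, 9, 5, 7]

print(progression_similarity(c_major, g_major))  # identical after transposition
print(progression_similarity(c_major, doo_wop))  # no intervals in common
```

The user would only ever click "find me something similar"; the interval comparison stays underneath.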

The point is, there are so many more ways of doing music search than simply "hum a sequence of notes and find other songs with that same sequence". I was trying to expand your mind as to the wide range of possibilities.

Philipp Lenssen [PersonRank 10]

17 years ago #

> In that way, instead of the user saying "I like
> classical, find me more classical songs", the user can
> say "I like this Rachmaninov piece. Find me something
> similar" and the system will find something with
> equal harmonic complexity, rather than just any old
> simple folk ditty.

Absolutely agreed! A simple interface that lets you click a "similar songs" link, and then some complex background computation. I think our concepts aren't far apart, except that you describe the backend (the hard part!), I described the frontend (the easy part – I am actually just illustrating a wish with mockups!).

As for the "humming can't work" theory, let's wait and see. I don't think we can assume that just because humans aren't great at this, that machines can't be great. There are so many examples of where machines beat humans, and IMO this can include complex "song essence distillation" plus brute force pattern matching to return best matches. The human eye may not be able to differentiate between rgb(200,200,10) and rgb(200,200,11), but who says a computer can't? <-- just a random example, there are thousands others, and I agree this random example is not analogous to music matching.

MarWi [PersonRank 3]

17 years ago #

What I am looking for is an online tool that recognizes songs when you play them for 20 seconds or so (for example: you hear a song on the radio and want to know its name). There is a (pay) phone service that does just this and does it pretty well, but it would be nice to have something similar online and for free.

spanish [PersonRank 0]

17 years ago #

a free program called "tunatic" does it, and very well!

instead of using speakers and a mic, use the "recording trick" to send what is played on your pc directly, without a mic ;) ..... in the recording settings, change the input from microphone to stereo mix.

;)

Jeremy [PersonRank 0]

17 years ago #

Ok ok Philipp. We agree. Darn! :-) It's more fun to disagree :-)

In particular, the following thing you wrote about song essence distillation is exactly what I am talking about:

>There are so many examples of where machines beat humans,
>and IMO this can include complex "song essence distillation" plus
>brute force pattern matching to return best matches.

There are so many interesting algorithmic ways to do "song essence distillation". And so little work is being done in this area by the major search engines. With all their computational resources, I find that quite disheartening.

But even when they do start to throw large amounts of processing power at the problem, I really wouldn't call it "machines beating humans". I was talking about effectiveness, rather than efficiency. Yes, machines can certainly do it much faster than humans. But can they do it more accurately than humans? Let's take beat tracking, for example. A human can very easily find the "tactus" of a song, i.e. the beat at which the musician intends you to clap or nod your head.

However, a machine gets confused, because that tactus information, while easily human understandable, is not really in the signal. The machine will often place the tactus at 2x or 1/2x the "true" tactus. A machine can find the correct "phase" (it can be "on beat"), but it often cannot determine the correct frequency of the signal at that phase.
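A small Python sketch shows where the 2x / 1/2x confusion can come from. The onset signal is made up for illustration (strong clicks on the beat, weaker ones halfway between), and plain autocorrelation stands in for a real beat tracker, which would be far more elaborate. The point is only that the signal supports several periods about equally well.

```python
def autocorr(signal, lag):
    """Unnormalised autocorrelation of a list of numbers at a given lag."""
    return sum(a * b for a, b in zip(signal, signal[lag:]))


# Hypothetical onset strengths over 32 frames:
# 1.0 on the beat (every 4 frames), 0.5 on the off-beat, 0.0 elsewhere.
onsets = [1.0 if i % 4 == 0 else 0.5 if i % 4 == 2 else 0.0 for i in range(32)]

# Lag 4 is the "true" beat period; lag 2 is twice as fast, lag 8 half as fast.
for lag in (2, 3, 4, 8):
    print(lag, autocorr(onsets, lag))
```

Lags 2, 4 and 8 all score in the same range while lag 3 scores zero, so the phase is easy to find but picking the one period a human would clap to needs knowledge that is not in these numbers.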

Similarly, for the "query by humming" example you started with, the main problem is that humans cannot sing the correct notes. And when you cannot sing the correct notes, you have a classic vocabulary mismatch problem. A search engine can only give you what you ask for, with minor spelling correction suggestions. But in music, a minor spelling correction becomes a different song. Depending on which note was incorrectly hummed, you get a whole different possibly relevant set of songs... and the computer can't tell which set you really meant!

Here is an imperfect text example. Suppose you meant to type the word "food". However, instead of "f", you mis"sung" (mistyped) and wrote "good". Or "hood". Or instead of the second "o", you mistyped and spelled "ford". Or instead of the last "d" you mistyped, and spelled "fool".

So we have the case where people are coming in and typing:

good
hood
ford, or
fool

and in each case, what they really meant was "food". But because of a single wrong letter, it completely changed the query. How is a human supposed to know that you meant "food", when you type "ford"? And if a human can't know, how can a machine?

And what happens when the human mistypes two letters, instead of one, as often happens when humming a query? "food" becomes "plod" or "foal" or "soon" or "pool"? No amount of "wisdom of crowds" aggregate spelling correction is going to help the machine know that the user meant "food" when the user actually types "plod".
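The same text analogy can be run as a short Python sketch. The vocabulary is a tiny hypothetical stand-in for "all songs", and counting substituted letters (Hamming distance) is an assumption chosen to mirror the one-wrong-letter examples, not how any music search engine actually works.

```python
def hamming(a, b):
    """Number of differing positions; only defined for equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Tiny hypothetical vocabulary standing in for the song collection.
vocabulary = ["food", "good", "hood", "ford", "fool", "fork", "word", "pool", "plod"]


def candidates(query, max_errors):
    """All vocabulary words within max_errors substituted letters of the query."""
    return [w for w in vocabulary if hamming(query, w) <= max_errors]


print(candidates("ford", 0))  # exact match: only "ford", the wrong answer
print(candidates("ford", 1))  # "food" appears, but so do "fork" and "word"
print(candidates("plod", 2))  # two errors allowed: the candidate set grows further
```

Loosening the match does bring "food" back, but only alongside other words that are exactly as close to the query, and nothing in the query says which one was meant.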

One last point. You write:

>The human eye may not be able to differentiate between
>rgb(200,200,10) and rgb(200,200,11), but who says a computer
>can't? <-- just a random example, there are thousands others,
>and I agree this random example is not analogous to music matching.

This is actually one of the sources of the problem! The computer is being too literal here. The user only sees the color "puke yellow" (or however you want to describe that RGB combination). But the computer sees two different colors. If the user searches for pictures with "puke yellow", the computer has to know how to translate the human understanding of color (which is fuzzy and interpreted) into the literal machine values that the computer only understands.

I don't have time for a full tutorial on color and digital photography, but humans will often perceive a different color than the machine sees. Here is a picture that illustrates what I mean:

www.planetperplex.com/en/item2

If I am using a search algorithm to find the "white" squares in this picture above, there will be a problem. Because one of the white squares is actually the same RGB color as one of the black squares! A human can tell the difference between the black square and the white square. But a machine, since it only has access to the RGB values, cannot.
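In Python terms the problem looks roughly like this. The RGB values are invented for illustration and are not measured from the linked picture, and the "relative lightness" function is only a crude stand-in for taking the surroundings into account, not a model of human vision.

```python
def literal_match(pixel_a, pixel_b):
    """What a naive search does: compare the raw RGB tuples."""
    return pixel_a == pixel_b


def relative_lightness(pixel, surround):
    """Crude context measure: mean brightness minus the surround's mean brightness."""
    def mean(p):
        return sum(p) / len(p)
    return mean(pixel) - mean(surround)


# Hypothetical values: both squares have identical RGB but different neighbours.
square_in_light = (120, 120, 120)    # a human calls this one "black"
square_in_shadow = (120, 120, 120)   # a human calls this one "white"
bright_neighbours = (200, 200, 200)
shadowed_neighbours = (60, 60, 60)

print(literal_match(square_in_light, square_in_shadow))
print(relative_lightness(square_in_light, bright_neighbours))
print(relative_lightness(square_in_shadow, shadowed_neighbours))
```

On the raw values alone the two squares are the same color; they only come apart once something beyond the pixel itself is brought in, which is exactly the interpretation step the literal comparison lacks.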

There exist many similar problems for music. Verstehst?

MarWi [PersonRank 3]

17 years ago #

@spanish: Thanks a lot!

elyk [PersonRank 6]

17 years ago #

good idea overall...but why not offer the music in an open format such as ogg in addition to mp3? that way they can appeal to the linux users who want a purely oss machine but can't install open source mp3 encoders because of legal issues.

DPic [PersonRank 10]

17 years ago #

I was just about to suggest ogg as well as FLAC.

And Philipp and Jeremy, you forgot the simplest of all- searching by singing the lyrics! :P

Also, to add on to the Pandora idea, i love the concept of Pandora, but i don't like their service. If Google could take that idea, offer a huge library of music and find relevant music and create personalized playlists, that would be amazing.

About categorizing them into genres- searching by genre can be difficult. A method i think might work (though i don't really know what i am talking about) would be to not only have songs that fit under one or two genres but to instead display only one genre and a few sub-genres but behind the scenes have some sort of ranking that the song has in all genres which could be effective in search relevancy.
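A rough Python sketch of that behind-the-scenes ranking, with every number and song title made up for illustration: each song carries a score for every genre, only the top one is displayed, and search compares the full score vectors. Cosine similarity is an assumed choice here; the post does not name a measure.

```python
import math

GENRES = ["rock", "reggae", "classical", "blues"]

# Hypothetical per-genre scores in [0, 1]; not real data.
songs = {
    "song_a": [0.9, 0.1, 0.0, 0.6],
    "song_b": [0.8, 0.0, 0.1, 0.7],
    "song_c": [0.0, 0.1, 0.9, 0.0],
}


def display_genre(scores):
    """The single genre shown to the user: the highest-scoring one."""
    return GENRES[scores.index(max(scores))]


def cosine(u, v):
    """Cosine similarity between two score vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(title):
    """The other song whose full genre profile is closest to the given one."""
    others = [t for t in songs if t != title]
    return max(others, key=lambda t: cosine(songs[title], songs[t]))


print(display_genre(songs["song_a"]))
print(most_similar("song_a"))
```

Two songs that would both be filed under the same displayed genre can still be ranked against each other by how much of the other genres they share.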

I would also love to be able to access the song from Google at anytime and have it played on my blog. So i could download a copy on my computer and have a hosted copy available to me through Google.

It would be great to be able to set up an internet station using this tool or a playlist for my friends to listen to. So Google could function as last.fm AND Pandora! Then Google Music can have music trends merged in and also have integration with Orkut 2.0!

Oh the possibilities are endless! And the world of music is so big! The only problem is restrictive copyright laws and stuff. I think we should definitely discuss more how this should work, since Google might see good reason to get into this.
