Thursday, August 17, 2017

icons - What would be a good way to thumbnail audio?


I want to represent an audio file in an image with a maximum size of 180×180 pixels.


I have been thinking about something similar to what SoundCloud does, although it would require some chopping and stretching to fit the waveform into a square:


Screenshot of Soundcloud's player



Does anyone have suggestions for this? I have been searching around for a while, mainly for audio visualization and audio thumbnailing, but I have not found anything useful.



Answer



Good question. In terms of UX there are a few different things you want in a thumbnail.




  1. Thumbnails should be easy to tell apart, so you may wish to use some sort of visual hash (an image generated deterministically from the file's contents).




  2. Similar items should look similar and different items should look different. Ideally you will want to encode a few things that are likely to remain similar across a sensible group of sounds. Here are some ideas as well as what they might help differentiate:




    • Length (e.g. podcasts vs short samples); this should probably be shown logarithmically

    • Some frequency-domain representation (death metal vs pop vs talking)

    • Volume histogram (procedurally generated vs 'organic' sounds)

    • Metadata, if it is available (file type, stream rate, etc.)




  3. I imagine the naming (assuming these files have names) will be up to the user/consumer. The name is the fallback if the user can't interpret the thumbnail, so make sure it is clear and prominent enough just in case. (This problem comes up with icons all the time; see UX Myth #13: Icons enhance usability.)





  4. Ideally the thumbnail should be memorable, though how to do this with sound I have no idea.
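
The properties listed in point 2 can be computed fairly cheaply. Below is a minimal sketch in Python, assuming the audio has already been decoded into a list of float samples in the range -1..1. The function names, the 1 second to 1 hour duration bounds, and the bin count are illustrative choices rather than anything prescribed above, and the zero-crossing rate is only a crude stand-in for a real frequency-domain representation such as a spectrogram.

```python
import math


def log_duration(seconds, min_s=1.0, max_s=3600.0):
    """Map a duration to 0..1 on a log scale (the bounds are assumptions)."""
    clamped = min(max(seconds, min_s), max_s)
    return math.log(clamped / min_s) / math.log(max_s / min_s)


def volume_histogram(samples, bins=8):
    """Fraction of samples that fall in each absolute-amplitude bucket."""
    counts = [0] * bins
    for s in samples:
        idx = min(int(abs(s) * bins), bins - 1)  # clamp |s| == 1.0 into the top bin
        counts[idx] += 1
    total = len(samples) or 1
    return [c / total for c in counts]


def zero_crossing_rate(samples):
    """Sign changes per sample pair: a crude proxy for dominant frequency."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings / max(len(samples) - 1, 1)


def audio_features(samples, sample_rate):
    """Bundle the three features for one decoded audio file."""
    return {
        "log_duration": log_duration(len(samples) / sample_rate),
        "volume_histogram": volume_histogram(samples),
        "zero_crossing_rate": zero_crossing_rate(samples),
    }
```

With these assumed bounds, a 6-minute and a 7-minute file land only about 0.02 apart on the 0..1 duration scale, while a 7-second file sits near 0.24, which is the kind of grouping the logarithmic mapping is meant to give.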




Finally, you want all of this to be clear on your 180×180 thumbnail. Doing this well will doubtless be very challenging.


Something like this might work (though it may be overkill or too CPU-intensive):




  1. Map the default hue to a mixture of some or all of the items in point 2 above. This should give similar colors to files that belong to the same group.





  2. Use a frequency spectrogram in the background to drive the saturation.




  3. The volume histogram can be represented by shifting the hue of buckets (i.e. squares within your 180×180 representation; there are 4 in my example, though you should use more).




  4. Duration can be mapped logarithmically to a border surrounding the whole thing, in the form of a bevel. (7-minute files will look almost exactly the same as 6-minute files, but completely different from 7-second files.)
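
To make the mapping concrete, here is a rough sketch of steps 1, 3 and 4 in Python using only the standard library. It assumes the base hue and the per-bucket loudness levels have already been computed elsewhere. The names `border_width` and `render`, the 20-pixel maximum frame, the 1 second to 1 hour bounds, and the 10% hue shift are all hypothetical choices for illustration. Saturation is a single constant here; driving it from a spectrogram (step 2) is left out to keep the example short.

```python
import colorsys
import math

SIZE = 180  # thumbnail edge in pixels, taken from the question


def border_width(seconds, max_width=20, min_s=1.0, max_s=3600.0):
    """Log-scaled frame width in pixels (bounds and max width are assumptions)."""
    clamped = min(max(seconds, min_s), max_s)
    frac = math.log(clamped / min_s) / math.log(max_s / min_s)
    return round(frac * max_width)


def render(base_hue, seconds, bucket_levels, saturation=0.6, grid=2):
    """Return SIZE x SIZE rows of (r, g, b) tuples with channels in 0..255.

    base_hue:      0..1, derived from whatever feature mix you choose
    bucket_levels: grid * grid loudness values in 0..1, row-major
    """
    border = border_width(seconds)
    inner = SIZE - 2 * border
    rows = []
    for y in range(SIZE):
        row = []
        for x in range(SIZE):
            if min(x, y, SIZE - 1 - x, SIZE - 1 - y) < border:
                # Darker frame whose thickness encodes the duration.
                hsv = (base_hue, saturation, 0.35)
            else:
                gx = (x - border) * grid // inner
                gy = (y - border) * grid // inner
                level = bucket_levels[gy * grid + gx]
                # Shift the hue by up to 10% of the color wheel per bucket.
                hsv = ((base_hue + 0.1 * level) % 1.0, saturation, 0.9)
            r, g, b = colorsys.hsv_to_rgb(*hsv)
            row.append((round(r * 255), round(g * 255), round(b * 255)))
        rows.append(row)
    return rows
```

Under these assumed bounds a 6-minute file gets a 14-pixel frame, a 7-minute file a 15-pixel frame, and a 7-second file a 5-pixel frame, so the first two look nearly identical and the third clearly different. The returned rows can be handed to any image library or written out as a PPM file.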




Here's an example of what it might look like:



Example of a sound file thumbnail


Hope this helps.

