FuckingWebBrowser

FuckingWebBrowser sonification demo from Michael Takezo Chinen on Vimeo.

FuckingWebBrowser is a simple open-source WebKit-based browser (there are hundreds of tutorials, including many annoying ones on YouTube, on how to make one in less than 2 minutes on a Mac). I added sonification, which converts the memory state into audio. The sound quality is noisy, but an overall structure is recognizable because changes in the user interface (images loading, mousing over, scrolling, etc.) directly affect computer memory. From 5’30” on there is a special sneak preview of a tool codenamed FuckingFucker, which can attach to any process using its PID or BSD name and sonify a disassembly of the process’s instructions at runtime (it can also read the CPU registers, for example). In this video FuckingFucker is attached to FuckingWebBrowser, providing a double sonification.
FuckingFucker is still in development, and is a work in collaboration with Institute of Algorhythmics.

One can download the mac OS app or get the source of FuckingWebBrowser here. Of note, the related project, FuckingAudacity, is also up here. FuckingWebBrowser uses a simple portaudio callback to make sound. It has very little source code and can serve as a good example of how to recompile an open-source app into one that sonifies memory state. I intend to post a tutorial on exactly how to do this soon. A future release will also concern FuckingFucker, which provides the ability to sonify the runtime process of even closed-source apps.
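
As a rough illustration of the idea (this is not the FuckingWebBrowser source, which hands memory to a PortAudio callback inside the app), the sketch below treats an arbitrary byte buffer as signed 16-bit PCM and wraps it in a WAV container. The function name `bytes_to_wav` and the use of a Python `bytes` object as a stand-in for program memory are assumptions made for illustration.

```python
import io
import wave

def bytes_to_wav(memory, sample_rate=44100):
    """Wrap raw bytes in a mono 16-bit WAV container without altering them."""
    # 16-bit samples need an even number of bytes, so drop a trailing odd byte.
    usable = memory[: len(memory) - (len(memory) % 2)]
    out = io.BytesIO()
    with wave.open(out, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes per sample = 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(usable)    # the "memory" is the audio, byte for byte
    return out.getvalue()

# Hypothetical stand-in for a chunk of program memory.
fake_memory = bytes(range(256)) * 4 + b"\x01"
wav_bytes = bytes_to_wav(fake_memory)
```

Writing `wav_bytes` to a file and opening it in any audio editor should give a short burst of patterned noise. Structured memory tends to sound structured, which is the effect heard in the video.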

FuckingAudacity video up

Audacity GUI performance from Michael Takezo Chinen on Vimeo.

Audacity is an open source sound editor. I modified the code so that the sound playback depends upon the physical memory state of the program. Because the memory is dependent upon the user interaction, the user interface (buttons, menus, etc) as well as the structure of user-driven processes (effects, importing files, etc) become a part of the sonification process. The noisy quality of the sound is due to the fact that chunks of uninitialized memory are continuously sent to the audio output.

This recording was done with a point and shoot camera, so the audio quality is not so good, but it gives you an idea of what I’m trying to do.

The program is called FuckingAudacity. How to go about building it and programs like this will be up shortly. The basics of this process are quite simple and can be applied to other open source programs as well. I will try to document this along with how to attach portaudio to a program if it doesn’t have sound i/o, allowing pretty much any c/c++ open source program to have a fucking version.

This process has actually been quite helpful for finding bugs (and fixing them), because the performance aspect makes for more extreme use cases. For example, I fixed a bug that existed in 1.3.10 where pressing play in a second project while another was already playing would cause the audio to glitch and stop. This process also has a healthy result: it is introducing me to more code, and thus more devel mailing lists/communities. For example, I’ve also been playing with SuperCollider in a similar fashion.

If you are interested, here are some related projects that I know of:

Shintaro Miyazaki’s work sonifies computer processes externally:

algoRHYTHMS everywhere Version 0.2 from Shintaro Miyazaki on Vimeo.

These two other projects were on slashdot:
http://www.codesounding.org/indexeng.html
http://cessu.blogspot.com/2008/09/have-you-listened-to-your-program-today.html

It’s interesting that they attach sonification data to MIDI and some meter, which is a pretty specific sonification scheme. One of them uses valgrind, which I am interested in. Unfortunately it doesn’t seem to work so well on my 5 year old computer for GUIs. But I want to find a way to work with a debugger that allows hotfixing/live program analysis (e.g. the call stack) as an extra layer of performance.

Square Wave: busting your mind, not your speakers

I’ve puzzled before over what it would take to create an extended wave that looks like a DC offset or a low frequency square wave (not an actual impossible square wave with infinite harmonics, but just one that mostly looks squarish). This is a case where the input signal, the speaker cone movement, and the resulting air pressure wave can look quite different from each other.

Another common time that square waves come up is when clipping happens in the amp/interface. This is a known speaker killer, but maybe not in all cases as I originally thought.

Today I actually googled the topics and found the following decent discussions of them.
http://www.diyaudio.com/forums/multi-way/5699-cant-reproduce-square-wave.html

http://www.bcae1.com/2ltlpwr.htm

http://www.rocketroberts.com/techart/spkr.htm

The diyaudio thread is huge, with some interesting parts.  Here are the main points I liked:

  • If a speaker receives a DC, it will begin to move at a more or less constant speed proportional to the amount of voltage.  It will take a certain amount of time for the speaker cone to reach the maximum point it was designed for – which suggests that there are many square waves that will not result in blowing up your speakers.  These friendly square waves would then appear to be of relatively lower amplitude and/or higher frequency.  This is probably not so surprising, but now at least you might feel a little better about playing pan sonic at reasonable volumes.
  • There is a derivative (as in calculus) effect between the input signal to the speaker, the speaker cone, and the waveform produced in the air. You put a square wave into a speaker. The cone moves in a triangular fashion. This creates a square wave in the air. Of course it doesn’t really matter so much except for phase when dealing with sinusoids, with their uncreative differentiation and integration laws.
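
The two points above can be checked numerically under a heavily idealized model. It assumes the cone velocity is exactly proportional to the input voltage and the air pressure follows the cone velocity, and it ignores cone mass, suspension and travel limits. The function names are made up for this sketch.

```python
def square_wave(n_samples, period):
    """Idealized square wave alternating between +1 and -1."""
    half = period // 2
    return [1.0 if (i % period) < half else -1.0 for i in range(n_samples)]

def integrate(signal, dt=1.0):
    """Running sum: cone position if velocity is proportional to voltage."""
    total, out = 0.0, []
    for x in signal:
        total += x * dt
        out.append(total)
    return out

def differentiate(signal, dt=1.0):
    """First difference: rate of change of the cone position."""
    prev, out = 0.0, []
    for x in signal:
        out.append((x - prev) / dt)
        prev = x
    return out

voltage = square_wave(32, 8)      # what goes into the speaker
cone = integrate(voltage)         # ramps up and down: a triangle
pressure = differentiate(cone)    # square again
```

In this toy model the cone never travels further than half a period's worth of ramp, which is why a higher frequency or lower amplitude square wave keeps the excursion small.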

Of course, there are many square waves that will blow your speakers up.

Next, I want to find out what the hell wind is.

Open Source Mac Screen Recorder

I’ve been doing more GUI based music stuff. This is partially a response to the fact that I usually don’t enjoy performing tape music live. Don’t get me wrong, I can enjoy a concert of tape music. Most of my music is fixed media. I’m not talking about seeing virtuosity or even expressing emotion. But I really enjoy when I see the person who made the music communicate that he actually believes in it during performance. Some people can do this amazingly well with just a trackpad. I’ve decided to work more with hacked GUIs because they are tied to the software process, and also it’s like my way of getting back at them for driving me fucking crazy most of the time.

I’ve been working on the website, and I see that I have little of my GUI based music online.  But when it comes down to documenting these things on a mac, I couldn’t find a good free tool that does this.  I am amazed that this doesn’t exist, given the number of media/video/live-coding artists that work on mac.  I looked for this in the past, but just assumed that I wasn’t trying hard enough or I wasn’t using the right search terms (is it a screen recorder or an av capture tool?)  There are shareware programs and such that have crippled versions of their software for free.  But there are some options:

I’ve looked quite a bit by the way. There is one freeware tool called “Capture Me” which does low frame-rate video capture, but only for a few seconds. It is meant for single frame capture and actually looks like a good starting point. Also, it is labeled ‘open source’, but I can’t find the source code anywhere on the developer’s website, sourceforge, or google. I’ve contacted the developer and will follow up if I get a reply, but he says due to a job change, he probably won’t get back to emails.

There is a tool for recording without audio on mac. It’s called krut, and it’s in Java. It’s actually open source. I’ll post some results on my g4 once I get to try it out.

I’m still thinking about putting together a really shitty, really free, really open source fast-capture audio video screen capture tool, because I need it and I’ve seen the demand.  If anyone is working on this, please get in contact with me.  I’ll help.  Otherwise I’m going to do it very quick and it’ll have bugs, but it’ll get the job done for now.   Fast-food software, if you will.  Maybe it’s a really hard problem that can’t be tackled.

Transition Details

Just moved to wordpress.  The old website pages still exist. But they will take you out of the main page.

The only one I’m not porting over is the dev blog, because it’s too long:

I’m being lazy and just posting the html here:





Development Log.

Saturday, December 13, 2008



Just in time for the 2008. We’ve been working more diligently than an imaginary development log reader might think. And not just on such great koans as “what thought does an imaginary development blog reader have upon reading an imaginary blog with an imaginary subject,” but also on our genetic algorithm. We’ve fixed a lot of bugs in the GUI, but besides that, completely changed the spectral estimate function by doing real hard empirical tests to create an integral that was both simple enough to run in good time and mapped well enough onto the actual synthesized spectrum it makes.
The other real big change was the implementation of a cache. It is a two dimensional backwards and forwards extra-fancy grade cache that uses extra-fancy hit ratio tracking. The basic idea is that our variable length chromosomes grow bigger and bigger, so a single mutation is likely to change only a small part of the sound, in either frequency space or time space. So what we do is cache every frame’s estimate, and use it if the interpolated parameters match. BUT WAIT, THERE’S MORE! The spectral estimate for a given frame is the sum of the integrals for each noise band. So if a noiseband in the middle of the thing changes, we can’t use the cached frame. So what we do is cache the, say, 20%, 40%, 60%, and 80% noiseband marks, and do the same in reverse. So if the noiseband at the 55% mark changes, we go back and add the forward 40% cache and the backwards 40% cache (which is really 60-100%), and then from there manually calculate the integral for the 40-60% span and add those. In other words, this doesn’t make any sense unless you, fortunately like we did, upgraded our imaginary blog readers to the telepathic-understanding-cache-imaginary-dev-log-readers type.
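
For readers without the telepathic upgrade, here is a minimal sketch of the forward/backward partial-sum idea. It is not the project's code. It assumes each noise band contributes one independent number to a frame's estimate (the real estimate is presumably a spectrum, and the real cache also checks interpolated parameters), and the class and method names are invented.

```python
class FrameCache:
    """Caches partial sums of per-band contributions at evenly spaced marks."""

    def __init__(self, contributions, n_marks=4):
        self.n = len(contributions)
        # With n_marks=4 these are the 20%, 40%, 60% and 80% band indices.
        self.marks = [self.n * (m + 1) // (n_marks + 1) for m in range(n_marks)]
        # forward[m]: sum of bands before mark m; backward[m]: sum from m on.
        self.forward = {m: sum(contributions[:m]) for m in self.marks}
        self.backward = {m: sum(contributions[m:]) for m in self.marks}

    def estimate_after_change(self, contributions, changed_index):
        """Return (new total, number of bands that had to be re-summed)."""
        lo = max((m for m in self.marks if m <= changed_index), default=0)
        hi = min((m for m in self.marks if m > changed_index), default=self.n)
        head = self.forward.get(lo, 0.0)    # cached bands before the gap
        tail = self.backward.get(hi, 0.0)   # cached bands after the gap
        middle = sum(contributions[lo:hi])  # only this span is recomputed
        return head + middle + tail, hi - lo

bands = [float(i) for i in range(1, 11)]  # ten hypothetical band integrals
cache = FrameCache(bands)
changed = list(bands)
changed[5] = 100.0                        # the band at the "55%" position mutates
total, recomputed = cache.estimate_after_change(changed, 5)
```

Here only the two bands between the 40% and 60% marks are re-summed. The other eight come from the cached forward and backward sums.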

Also if you’ve been reading the forums, biological evolution seems like it is optimal at an average of 1 mutation per chromosome. It turns out that works well for genetic algorithms too, so we’ve implemented that.
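
One common way to get an average of one mutation per chromosome is to give each gene a mutation probability of 1/length. The sketch below assumes that approach and a flat list of floats as the chromosome. The actual implementation in the project may differ, and `mutate` is a made-up name.

```python
import random

def mutate(chromosome, rng):
    """Mutate each gene with probability 1/len, so one gene changes on average."""
    if not chromosome:
        return [], 0
    rate = 1.0 / len(chromosome)
    child, count = [], 0
    for gene in chromosome:
        if rng.random() < rate:
            child.append(rng.random())  # replace the gene with a fresh random value
            count += 1
        else:
            child.append(gene)
    return child, count

rng = random.Random(42)
parent = [0.5] * 50
counts = [mutate(parent, rng)[1] for _ in range(2000)]
average_mutations = sum(counts) / len(counts)  # close to 1 over many trials
```

Any single child may get zero, one or several mutations. Only the long-run average is pinned to one, regardless of chromosome length.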

We still have a ways to go, but we can do noisy sounds, such as wind now. Female speech is a long way yet to go, mostly because we need more dynamic window functions (hint hint).

Wind experiment directory of wav synthesis and original for wind/screen source.

Female Speech experiment Bad, but understandable female speech experiment. original and subsequent generations.


Saturday, October 20, 2007


It has been far too long. We presented this project at ICMC in the fashionably expensive city of Copenhagen. Here’s the paper.

As far as development goes, things have changed since then. But as always, since you, pretend reader, are such a defender and demander of quality, let us do naught but make haste to demandee you some:
Noise Band bass examples

Sinusoids that move in amplitude and frequency along randomly generated envelopes, with a bandwidth envelope that goes from fully open to fully closed, gradually opening up over one minute. They’ve gotten much better and can trick some people sometimes into believing they are real sounds in the end.

Original bass input sound (bass) wave format, about 300k.

Synthesized result, 13th generation Early result

Synthesized result, 300th generation

Synthesized result, 700th generation

Synthesized result, 1700th generation

Synthesized result, 16000th generation Final result (sounds very similar after about 3000 and on)


Friday, April 13, 2007


Today would not have lived up to its reputation as a Friday the thirteenth had we not just finished a new method of synthesis for our engine: sinusoidal based noisebands. And I’m talking full bandwidth noise bands that are not weak wussy filter generated noise bands with their unpredictable phase and amplitudes. I’m talking about grown men monotonic in phase sinusoids that have the advantage of being obedient when you want them to be. Last week we did four hours of googling and found no obvious similar techniques, so this small synthesis component may actually be worthy of publishing. However, for your grain of salt, three and four fifths of an hour of that time was spent self googling. Have a listen for yourself to decide how sweet this is:

Noise Band examples

Sinusoids that move in amplitude and frequency randomly generated envelopes, with a bandwidth envelope that goes from fully open to fully closed, gradually opening up over one minute

80hz Noise Band

400hz Noise Band

1000hz Noise Band

10000hz Noise Band


Thursday, February 1, 2007



There are some sweet new open source projects on sourceforge, some of which we might have been partly responsible for. The benefits are, of course, CVS and public incentive.

check it.

also, here’s some older installation music that never got posted.


Friday, December 22, 2006



It has been a few months since our last post. We’ve been showing off music at places. Sometimes it was relevant to this research, like the new piece that uses the GA output to make all its noise. In all, we went to three cities in Japan and played music.

However, we are a little happy to get back to research. Lately in our heads we have been thinking of using some kind of subtractive model to make the fitness function really speedy. This still relies on using the FFT. Some of us are thinking to look more into mp3 and ogg encoding to figure out how they represent their spectral data. The sinusoids are getting a little harsh, and if you’ve heard that piece, you’ll know what we’re talking about. Anyway, hang in there, we’ll post more later.


Tuesday, September 5, 2006



In the middle of August we presented our current state of research to an acoustic research society. But as a reader, it is not clear to you what the current state of our research is, and we apologize about the sentence order used here. Since the last update, a lot has been accomplished – we can now resynthesize certain sounds like cello within a few decibels of spectral distortion. However, we have come to think this sinusoidal based model kind of rather sounds like shit, and are looking for other ways to produce sound. However, the results of the current model are below for your listening pleasure.

Last week we demonstrated our genetic algorithm sounds in a real time interactive system that uses four light sensors. We recorded lots of insects in Shikoku, analyzed just two of them with our genetic algorithm, and used the output sound to do a 4 channel sound space. It was done in a week using SuperCollider, so the results are so so. Here is a picture of interested-looking and uninterested-looking Japanese people playing with it. Listen to the sound below.


Analysis/Synthesis Examples:




Original Recorded sound (bass) wave format, about 300k.

Evolution progression of the resynthesis. wave format, about 2.3mb. A concatenation of the most fit sound in progressive generations.


Live Photosensor demo Excerpt:




Clickit excerpt mp3 format, about 3.3mb.


Wednesday, June 28, 2006



The New Deal

Everything has changed. I’m panicking. The structure of the chromosome is being ripped apart to make way for new interpolated sound generators that guarantee some smoothness. Listen:

Examples:


Chromosomes with Interpolated Sin Sound Roots. Mp3 format, about 2 mb.

Ben is actually 50 chromosomes, each lasting about two seconds, overlapping by one second.

It sounds like fireworks. The little buggers are lively.


Wednesday, June 21, 2006



We had to present some kind of demo with our incomplete software for an open campus event to attract high school students. So this is what we spat at them besides our horrible Japanese.

Examples:


Sound Morphing Trials. Mp3 format, about 2 mbs a pack.

First attempt from a bass note to a clarinet. But we messed up the heuristic function, so it goes wild

2nd attempt is about 200 generations played over a 60 second period, so they overlap, and you hear a sort of Phase vocoder-like effect

2nd attempt, discrete is the same as above, but with only 20 generations, so the sounds are discrete and you can hear the morphing.

These files show how morphing is possible with a genetic algorithm framework. To do it, you simply set the initial generation’s population to be not random, but a chromosome representation of the start sound.
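
A toy version of that seeding trick, assuming a chromosome is just a list of floats and fitness is the squared distance to the target (the real chromosomes and fitness function are far richer). All names here are invented for the sketch.

```python
import random

def distance(a, b):
    """Squared distance between two equal-length chromosomes (lower is fitter)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def morph(start, target, generations=50, pop_size=20, seed=1):
    """Evolve from `start` toward `target`; return the fittest of each generation."""
    rng = random.Random(seed)
    # The key step: seed the first generation with the start sound, not noise.
    population = [list(start) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda c: distance(c, target))
        best = population[0]
        history.append(list(best))
        parents = population[: pop_size // 2]
        children = [list(best)]                  # keep the fittest unchanged
        while len(children) < pop_size:
            child = list(rng.choice(parents))
            i = rng.randrange(len(child))
            child[i] += rng.gauss(0.0, 0.1)      # small mutation of one gene
            children.append(child)
        population = children
    return history

bass = [0.0, 0.0, 0.0, 0.0]       # hypothetical "start sound" parameters
clarinet = [1.0, -1.0, 0.5, 2.0]  # hypothetical "target sound" parameters
history = morph(bass, clarinet)
distances = [distance(h, clarinet) for h in history]
```

Synthesizing each entry of `history` in order would give the morph: the first entry is the start sound itself, and later entries drift toward the target.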


Thursday, June 15th



Ohisashiburi da ne. (Long time no see.) (Let’s pretend for once that it’s just one guy living in Japan that is writing these logs, (instead of thirteen aliens of different solar systems so enthused by the human race that they get their kicks living on mars remotely controlling the human bodyshell of a guy living in Japan.) Then, we are eligible (pretend-eligible says alien2,) to write in the first person (pretend-first person).)

I just moved across Tokyo from Asakusa to Odaiba. This means not being able to afford a moving service. But instead I had the privilege of carrying 200 pounds of stuff in my arms divided amongst four trips on the world’s most crowded trains. It also means that the genetic algorithm had to take a break for a few days.

However, since a whole month has passed since my last post, it should be said that considerable progress has been made. The crossover algorithm got completely hierarchical and crazy. Most notable, though, is the speedup factor of 300x from our last post under certain conditions. This was accomplished with a dumb cache, a bit of sorting, and the use of heuristic estimation *before* synthesis, allowing us to get away from the bad one (Slays kings, ruins towns and beats high mountains and everything but Tolkien,) without even putting down 44100 samples per second per gene.

Also, for now I’ve really simplified the test problem. I have changed from trying to resynthesize with high granularity but low quality using the previous “Lincolnshire Posy” example, to aiming for a perfect or near perfect resynthesis of a simple clarinet sample. It is easier to go from high quality to low, for one. Also, this is a more scientific and engineering problem now, and does not make it any less artistic, I do not think. “haha, he said ‘I do not think'”. I do not think I do not think here are your moments of I do not think I do not think I do not think I think.

Examples:

Clarinet resynthesis attempt, three second clips: wav format, about 300k. (only using sinusoidal cells.)

First Generation (is a random collection of sin waves)

11th Generation, fittest has found a low sinusoid near the fundamental

1000th Generation, fittest has found many of the harmonics, but lots are out of tune

10000th Generation, fittest has found all of the harmonics, and tuned them, but the phase and attacks are still missing.

In the end, you can hear a really rough sound of a clarinet, but the sinusoids in the resynthesis do not ‘bend’ like the real ones in a clarinet do, so the harmonics don’t sound fused. There is much to be done for sure, but that you can kind-of-sort-of hear a resemblance to a clarinet, and that this originated from a random set of sinusoids, gives us a lot of hope. Also, we need a good resynthesis in a month and a half, because we have an official presentation coming up in August, and will need a paper by then.


Tuesday, May 16




Several things have come to pass. We presented the algorithm yesterday. The presentation assumed no knowledge of genetic algorithms going in, so it took a little while longer to explain. Also, we are in Japan right now, which speaks mostly in a non-English language, and this also takes some time to translate. The presentation is in powerpoint format, so you can download it if you are curious. Note the moniker “GeneSynth” and not “Fucking Sound,” just to prove we can come up with clever titles. Of course we all know that if you use the word “Fucking” in an academic presentation, researchers will look the word up, realize what sex is, and then immediately turn roundabout, ceasing all intellectual pursuits for the newly discovered carnal quest, or at least think we are trying to be revolutionary for the wrong reasons, and we definitely can’t have the former. Anyway, GeneSynth is what the project has always been called in XCode, and we plan to put it up soon on SourceForge as an open source project, once things are working better.

Much more work has been done to the algorithm, but it is mostly bug fixing. The Fitness Functions have been implemented, using FFTs and RMS values. A bad gaussian noise unit generator has been implemented, yielding bad, but existing results. The good thing is that now we have enough to start fitting sounds to other sounds. The bad thing is that the algorithm needs to be fixed up to be able to process longer sounds (see the example below). Some work has already been done on this: all audio buffers used by the GA now go through a custom memory manager, which shares and recycles memory efficiently, but more is needed.
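
The post only says the fitness functions use FFTs and RMS values, so the following is one plausible shape for such a function, not the GeneSynth code. It compares magnitude spectra and overall loudness, uses a naive DFT instead of a real FFT to stay dependency-free, and the weighting between the two terms is an assumption.

```python
import cmath
import math

def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def magnitude_spectrum(signal):
    """Magnitudes of the first n/2+1 DFT bins (naive O(n^2) DFT, not an FFT)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal))) / n
        for k in range(n // 2 + 1)
    ]

def fitness(candidate, target, rms_weight=1.0):
    """Higher is better; 0 means identical spectrum magnitudes and RMS."""
    spec_err = sum((a - b) ** 2 for a, b in
                   zip(magnitude_spectrum(candidate), magnitude_spectrum(target)))
    rms_err = (rms(candidate) - rms(target)) ** 2
    return -(spec_err + rms_weight * rms_err)

n = 64
target = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
quieter = [0.9 * x for x in target]                                  # right pitch, a bit soft
wrong_pitch = [math.sin(2 * math.pi * 10 * t / n) for t in range(n)]  # same loudness, wrong bin
```

With these inputs the quieter copy scores better than the wrong-pitch one, which is the behavior a selection step needs: spectral shape matters more than a small loudness error.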

Examples:


Sound Fitting: mp3 format, 250k each, 20 secs

Target, pre-existing file we are trying to fit a bunch of cloth, guitar, piano, and cello samples against. Note the dynamic contour – the steady attacks in the first 10 seconds, followed by a pause and a very soft section which crescendos at the end.

Random Fitting, the first generation is created by making (30) random chromosomes and synthesizing each of them. This is the best one scored by our FFT fitness function. It is random in terms of the chromosome, but because we use pre-existing samples to form it, it has more interesting features than random samples (noise).

Fitting, 50 generations Later is what you get after 50 generations with our buggy and slow algorithm. Even so, you can vaguely hear, and definitely see in an audio editor, that the contour starts to match the shape of the target. At 10 seconds there is a clear drop off in volume that builds towards the end, just as the target has. Also, the frequency content is closer than the random sample – you hear the last chord in the first half being represented by higher pitched samples.

Monday, May 8th



Two major components have been implemented – the mutation and crossover of chromosomes. There was a lot of bug hunting, and probably more of the critters will be revealed once logging is implemented, but right now it *sounds* like they are working. The mutation and crossover of the chromosomes we are using for sound synthesis is made more difficult than the easy peasy classic genetic algorithm because our chromosome structure is complicated in several ways: it is of variable length, it is self referential, and it can recursively nest parts of itself. Now, some computer scientist might look at these words and scoff the back of their throat out into the open, because it seems like we may be taking a perfectly good simple algorithm and making it all complicated for no other purpose except to confuse and be able to use smart words as such, with our arbitrary ooh-thats-neat-self-referential-structure. And it would be fine because we really believe not enough people scoff energetically these days. But really we are doing it for a reason that we believe is correct. Of course we won’t know for a while. Faith is not just for Intelligent Designers, but also the ones that aren’t.
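
To show just the variable-length part of the problem, here is a sketch of one-point crossover where each parent gets its own cut point, so the children can end up with different lengths than either parent. It deliberately ignores the self-referential and nested structure described above, and the flat list-of-genes representation is an assumption.

```python
import random

def crossover(mother, father, rng):
    """One-point crossover with an independent cut point in each parent."""
    cut_m = rng.randint(0, len(mother))  # randint includes both endpoints
    cut_f = rng.randint(0, len(father))
    child_a = mother[:cut_m] + father[cut_f:]
    child_b = father[:cut_f] + mother[cut_m:]
    return child_a, child_b

rng = random.Random(7)
jane = ["j1", "j2", "j3", "j4", "j5"]  # hypothetical gene labels
john = ["o1", "o2", "o3"]
jack, jill = crossover(jane, john, rng)
```

Between them the two children carry every parental gene exactly once, but a child can be empty if both cuts land at the edges, so a real implementation would likely enforce a minimum length.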

Examples:


Mutation: mp3 format, 3Mb, four minutes of audio.

First Mutation Chain is a chromosome that is mutated fifty times and played each time, showing the gradual mutation process. You can hear the pauses about every four seconds which signal the next mutation.


Crossover: WAV format, about 350k and 3 seconds each clip.
Since crossover is the namesake of this website, this is a pretty big deal. The crossover examples below use chromosomes with very small chromosome sizes so that you can hear more clearly how the next of kin inherits traits. Jane and John are the consummators and Jack and Jill are the little bastards, as we’ve yet to implement a Marrige() function for chromosome.

Jane and John (John is softer and has less events than Jane)

Jack and Jill

The results are easy to hear. The next step is to implement an FFT and RMS fitness evaluation function along with the selection function, and we will be able to hear some sounds that aren’t just random sin and random Morrissey. Tune in next time to see what Jack and Jill did.


Monday, April 24th




The first few chromosomes have been synthesized. They are just noise and created by random initialization, but these will be the dollar bills in plastic sleeves that are hung up long enough to be dried out until stained purple blobs appear from the one time the new dishwasher forgot to close the window when it was raining oh so hard.


three second clips: wav format, about 300k. Both have under 100 cell definitions and 200 cells per second.

Jane Number 1 is created with genes of file sound sources

Jane Number 2 is created with genes of sinusoidal synths

The file sound source based ones are pleasant because there aren’t many files (only two) to choose from in my library yet so you get a lot of echo/reverb like effect.

I like the sin based ones. It’s more pleasant than I expected from random init, I guess because each cell definition can be used many times to make a drone/chorus effect.