Frequency bands are often used in analysis or input representation; in a mel spectrogram, for example, a number of bands of differing frequency widths are used to represent the signal. Frequency bands are also used in synthesis. However, synthesizing a frequency band of nonzero width is usually a noisy process. The most common approach is to generate wideband noise, either with a white noise generator or an impulse, and filter it down to a narrower range of frequencies, for example with a band-pass filter or high-Q IIR filter. When narrow enough, these bands resemble sinusoids with some attractive roughness. FM can be used to create narrow bands as well, and can be made more complex with daisy-chained FM. On the music page, the piece “December 9” uses daisy-chained FM and granular synthesis exclusively to create tones and rain-like sounds. Each of these has some drawbacks and advantages depending on your use case – high-Q IIR filters might explode, daisy-chained FM is unstable in other ways. They are all pretty neat – somehow it’s enjoyable to turn noise into something that resembles a sinusoid.
I want to describe another technique to create banded noise which I’ve used before that does the inverse, turning a sinusoid into noise. I’m fairly sure others have used this as well, but it doesn’t seem to be well documented. The basic idea is to start with a sinusoidal unit generator and linearly interpolate the instantaneous frequency from the previous target frequency to a new random target frequency, with the target updating at random intervals whose average duration is inversely proportional to the band width. The phase advances monotonically each step. The result is a band of noise that is a perfect sinusoidal tone at zero width, and white noise at full width.
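Here is a minimal sketch of the idea in Python with NumPy. The function name, the exponential spacing of the target updates, and the mapping from width to frequency deviation and update rate are my own choices for illustration, not a definitive implementation:

```python
import numpy as np

def band_noise(center_hz, width, duration_s, sr=44100, seed=0):
    """Sinusoid whose instantaneous frequency is linearly interpolated
    from the previous random target to a new one, at random intervals.
    width is 0..1: 0 gives a pure tone at center_hz, larger values widen the band.
    """
    rng = np.random.default_rng(seed)
    n = int(duration_s * sr)
    out = np.zeros(n)
    phase = 0.0
    target = center_hz
    # average update interval (in samples) shrinks as the band gets wider
    avg_len = max(1.0, sr / (1.0 + width * sr / 8.0))
    i = 0
    while i < n:
        seg = max(1, int(rng.exponential(avg_len)))   # random interval length
        prev_target, target = target, center_hz * (1.0 + width * rng.uniform(-1.0, 1.0))
        for k in range(min(seg, n - i)):
            frac = (k + 1) / seg
            freq = prev_target + (target - prev_target) * frac  # linear interpolation
            phase += 2.0 * np.pi * freq / sr                    # phase advances monotonically
            out[i + k] = np.sin(phase)
        i += seg
    return out

# e.g. a fairly narrow band around 440 Hz:
# signal = band_noise(440.0, width=0.02, duration_s=2.0)
```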
This technique was described in my first academic paper at ICMC 2007 as follows:
It might be interesting to create some DDSP modules that implement this for a DNN context.
I wanted to write in-depth book reviews for books that I enjoyed in 2020 at the very start of the year. Well, it’s already been a month and I haven’t gotten to it.
So to kickstart it, I’m going to make the task easier and just list some of the books I read in 2020 that were noteworthy, with one or two sentences each. And to narrow it down further, I’m just going to talk about stats-related books, because I read a few of them.
This is a relentless, even thrilling debunking of bad science, starting with the replication crisis in psychology and ending up touching much more than I expected. The stories are captivating, and the explanations focus on the system’s perverse incentives for fraud, hype, and negligence more than on blaming individuals (although there is definitely shaming where it is called for). The criticism of Kahneman’s overconfidence in Thinking Fast and Slow was refreshing to read, because he (and Yudkowsky in his sequences) says something to the effect of ‘you have no choice but to believe after reading these studies’, which felt like it didn’t match the larger message about questioning your beliefs and updating them on new information. It is a good lesson, not without a healthy helping of irony: even rationalists need to be told to be less confident, and that they don’t know everything yet. Another great criticism was of Matthew Walker’s hugely popular, and often unchallenged, Why We Sleep, which I had just taken at face value before reading this book.
This is an excellent introduction to Bayesian statistics, and pairs wonderfully with the author’s engaging, enlightening, and entertaining recorded 2019 lectures on YouTube as well as the homework problems on GitHub. Miraculously, McElreath manages to pull off a new video phenomenon that mixes statistics with hints of standup comedy. There are no mesmerizing 3blue1brown-like plots, but McElreath picks interesting problems and datasets to play with, and breaks down each model into the core components that a newcomer would need to understand it. I also appreciate that the course is designed for a wide range of people, so there are very few assumptions about math background, but if you know calculus, information theory, and linear algebra, there are nice little asides that go deeper. It’s also great how up-to-date the book is – I didn’t expect to be so interested in the developments in Hamiltonian Monte Carlo from the past few years, but it seems the field is undergoing rapid development. This book also helped me understand causal inference and confounders, and is a great follow-up to the casual-reader-oriented Pearl book mentioned below.
This book was the first thing I read on causal inference and causality in general. It’s a light non-fiction book that assumes no math background, and does a good job of explaining how to disentangle correlation and causation. The book has interesting problems and examples, such as how controlling for certain variables can actually cause you to find spurious causation, and how to go about showing that smoking really does cause cancer when it is unethical to run a randomized controlled trial and there are tons of confounders. There is a fair amount of interesting history in it, and because of that, there are traces of personal politics that seem slightly out of place, but they don’t detract from the book too much. This book is sort of a teaser for Pearl’s deeper textbook, Causality, which I haven’t read. The Book of Why doesn’t really go into how you would create a model of your own, or how such a Bayesian model would compare to the deep neural network/stochastic gradient descent models that are driving the computing industry today. Perhaps Causality covers some of these things, but I felt it would benefit from a few comparisons between the popular frameworks on a toy problem that deep learning can’t solve. Still, it could be argued that this was not the point of the book. In any case it was an enlightening introduction that gave me other questions to pursue, which is the type of book I am after.
Probability is a hard thing for humans to think about. The debate between Bayesian and orthodox (frequentist) statistics around the relationship between event frequency and probability makes that clear. Setting that aside, there are a whole bunch of fields that care about log probability. Log probability is an elemental quantity of information theory. Entropy is a measure of uncertainty, and is at the core of information theory and data compression/coding. Entropy is the negative expected log probability:

$$H(X) = -\sum_{x} p(x) \log p(x) = -\mathbb{E}[\log p(x)]$$
For a uniform distribution this can be simplified even further. A fair coin toss or die throw, for example, has uniform probability over heads/tails or over the faces, and we get:

$$H = -\sum_{i=1}^{N} \frac{1}{N} \log \frac{1}{N} = \log N$$

where $N = 2$ for a coin flip and $N = 8$ for an 8-sided die (because there are 2 and 8 possible values, respectively). So with a base-2 log, we get $\log_2 2 = 1$ bit for the coin flip, and $\log_2 8 = 3$ bits for an 8-sided die.
One thing that might not be immediately obvious is that this allows us to compare different types of events to each other. We can now compare the uncertainty in a coin flip (1 bit) and the uncertainty in an 8-sided die (3 bits): it takes 3 coin flips to match the uncertainty of one die throw. In fact, this means you could simulate an 8-sided die with 3 coin flips (but not 2) using some sort of tree structure: the first flip determines whether the die lands in 1-4 or 5-8, the next whether it is in 1-2 or 3-4 if the first flip was heads (or 5-6 vs 7-8 for tails), and the last flip resolves which of those two numbers the die ends up on.
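Here is that tree written out as a quick sketch in Python (the function name is just for illustration):

```python
import random

def d8_from_coin_flips():
    """Simulate an 8-sided die with exactly 3 fair coin flips.
    Each flip contributes one bit, and 3 bits address the 8 faces,
    matching the entropy accounting: 3 * 1 bit = log2(8) = 3 bits."""
    value = 0
    for _ in range(3):
        flip = random.randint(0, 1)   # 0 = tails, 1 = heads
        value = value * 2 + flip      # descend one level of the binary tree
    return value + 1                  # map 0..7 onto faces 1..8

print(d8_from_coin_flips())
```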
You probably could have come up with this scheme to simulate die throws from coin flips without knowing how entropy is formulated. I find this interesting for a couple of reasons. First, it means there may be something intuitive about entropy already in our brains that we can dig up. Second, it gives us a formal way to verify and check what we intuited about randomness. For this post I want to focus on intuition.
The first time you are presented with entropy, you might wonder why we take the log of probability. That would be a funny thing to do without a reason. Why couldn’t I say, ‘take your logs and build a door and get out, I’ll just take the square or root instead and use that for my measure of uncertainty’, and continue with my life? It turns out there are reasons. I wanted to use this post to capture those reasons and the reference.
If you look at Shannon’s A Mathematical Theory of Communication, you will find a proof-based answer that’s quite nice. But even after looking at it, especially if you haven’t seen a convexity-based proof in a while, it can still be somewhat unintuitive why there needs to be a logarithm involved. Here is a less formal and incomplete explanation that would have been useful for me to get more perspective on the problem.
There are a few desirable properties of entropy that Shannon describes. For example, entropy should be highest when all the events are equally likely. Another is how combining independent events like coin flips or dice rolls combines the possibilities: the number of outcomes is exponential in the number of coin flips or die rolls. So if I compute the entropy of one coin flip and of another coin flip and add them together, the sum should be the same value as if I computed a single entropy over the two coin flips taken together.
If you want a measure that grows linearly in the number of coin flips or die rolls while achieving this property, then taking the logarithm of the number of combinations gives you just that; no function other than a logarithm will do. This is because the number of possibilities for $n$ coin flips is exponential: notice that $\log_2(2^n) = n$, where $n$ is the number of coin tosses and $2^n$ is the number of possible outcomes for $n$ coin tosses. So the log inverts the exponent introduced by combining multiple events, which gives entropy the linear property in $n$.
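As a small sanity check of that additivity (this is just an illustration, not from Shannon’s paper), here is the entropy of one fair coin flip, of three independent flips taken jointly, and of an 8-sided die:

```python
import math
from itertools import product

def entropy(probs, base=2):
    """Shannon entropy H = -sum p * log(p), in bits by default."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))                              # one flip: 1.0 bit
joint = [0.5 ** 3 for _ in product([0, 1], repeat=3)]   # 8 equally likely outcomes
print(entropy(joint))                                   # three flips jointly: 3.0 bits
print(entropy([1 / 8] * 8))                             # 8-sided die: also 3.0 bits
```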
We accept that certain events early in the course of our lives influence us with a permanence that lives on in our identity. For many people, music will be one of those events. In the opening of High Fidelity, the main character sarcastically warns about the dangers of music for kids. In this post I want to consider the aftereffects of music (such as nostalgia) after the novelty wears off. “Sentimental music has this great way of taking you back somewhere at the same time that it takes you forward, so you feel nostalgic and hopeful all at the same time.” — also from High Fidelity, and closer to today’s topic.
The older I get, the fewer surprises there seem to be. I should probably restate that. There are still many surprises, but the intensity and the excitement — the amount something grips me — seem to go down with age. This is a bit depressing if you look at it one way, but I don’t think it has to be. Rather, it seems like a natural consequence of understanding more about the world. The most obvious analogue is film and literature – at first you can start with the classics and they are all amazing. Then eventually you get to a point where you can see how those classics influenced the newer works, getting more perspective and insight into the world. This is quite an interesting feeling too. But the novel spark of that first foray into the arts is really something special.
Where did those bright-eyed, wonderful times go? I’m reminded of them most viscerally when I put on music. Sometimes I talk or write about experimental computer music. But the particular music that does this for me comes from a few specific bands and genres that are now very far from me culturally – the teenage and college years of grunge, pop-punk, emo, and indie rock, from The Descendents to the popular Garden State soundtrack. Nearing 40, I still feel a certain strong and invincible euphoria when listening to these genres, even though it’s been years since I listened. If I get into it and shake my head and air-drum, it is even better. The music wired up a strong pathway for belonging and excitement in my brain, and then the neurons were set to read-only, to last much longer than I would have thought. When I am 80, will it still be there?
At the same time, the act of listening to this music feels more than a bit weird. Perhaps it feels masturbatory or escapist. There is a disconnect between who I am now and who I was then, and the music is an instant portal that revives those youthful brain pathways.
And if we take it one step further, where did those bright eyes and times of wonder come from? Falling in love with a (sub)culture because of someone you loved is one of the slyest and most curious phenomena near the heart of western individualism. The association with a beautiful face that you might get to see if you go to the next concert. Culture is much more than a sexually transmitted meme. The acceptance that comes from dressing a certain way and having a crew is really powerful, and dancing in unison to a beat is a primal joy even if the footsteps are different. For me, it was always the elusive nature and the promise that glory was just around the corner. The Cat Power song goes “It must just be the colors and kids that keep me alive, because the music is boring me to death”, which is something a popular artist might be able to say after overwhelming success. But I was always on the outskirts, never really feeling in the middle of the culture that I loved. The colors, the kids, they were great, but the thing that lasted for me was the music. And I think, just as with the friendship paradox, this is the case for most people.
What would it take to find something like that again? Is it a matter of age, hormones, and college or can we recreate that somehow? Would I even want to if I could? Why can’t I, say, take these nostalgic feelings from music that I grew out of and ‘install’ them into music and things I am interested in now?
This is related to rewriting your utility function, which is a very scary thing that will literally annihilate your identity if you get it wrong. Just as we are programmed through evolution to love doing certain things – to feel joy from the wind going through our hair on a good walk – we are also programmed, to a lesser extent, through society and culture during our developmental youth. But we do not get to swap the way we feel about walking through nature with the way we feel about thinking about high-dimensional spaces. The latter has an intellectual pleasure for sure, but it is less ‘natural’. If we could do this, the world would be very different in very little time. I am not sure if we would converge or diverge as a species, but I would posit that if you could make being good at your job feel like dancing for the average person, productivity would blow up globally and previous social problems would disappear in favor of newer, probably scarier problems. But this is a rabbit hole with many tropes and discussions, such as wireheading, that I would like to leave aside for now. I just want to note that technology seems still a ways off from this, but the feeling I get when I listen to the right music is a sneak preview. It is frightening and feels great at the same time.
This weekend I went back to basics and made some very tape-y music on the electroacoustically traditional themes of traffic and water. That being said, I left my usual mode of synthesis-based composition at the door: this is a memoir piece that I made in Audacity. I haven’t been making much music lately, so my goal was to have fun making something that I’ll enjoy listening to, in a few hours, using the recordings of rain, wind, and traffic that I’ve been collecting.
The Audacity of Nostalgia
The main traffic-in-rain recording was made outside “The Curb” coffee shop on Kaimuki’s Waialae Ave, the street where I spent most weekends of my youth hustling to sell Magic cards to buy used video games, in addition to 50-cent “ice cake” at the crack seed store.
Sitting at a fancy-but-gritty coffee shop that hadn’t existed a few years ago, recording the rain, I found myself comparing the smell of the rain and the air to the other places where I’ve recorded rain and similar relaxing noise in California.
Waialae Avenue is the main street that runs through Kaimuki on Oahu, where I grew up. Though it is constantly changing, I noticed that the initial scent of rain there was markedly different, and possibly unique, evoking a strong nostalgia. Looking for an explanation, I discovered the phenomenon of “petrichor”: the accumulation of oils and bacteria in rock and stone during dry periods that are released via aerosols when raindrops hit. I became curious whether the area’s particular geography and bacteria might give this scent a unique signature.
From this line of thinking, I also became interested in the acoustic signature of an area. I took recordings of mostly water-based sounds – Waialae Avenue’s rain, the drizzle from a chanterelle hunt, Big Sur’s waves, the Yuba River’s flow, and the airport that took me between these areas – and applied basic processing to emphasize their similarities and differences.
I wanted to combine some of these recordings in a piece that meshes them together, similar to how my consciousness hopped about between the roar of California’s Big Sur coast and the drizzle of my fall chanterelle hunts in the East Bay. The motion of the Yuba River and the wind of Ocean Beach are in there too. I also enjoyed hearing “Kaze wo Atsumete” at SLC when taking a non-non-stop trip back from HNL to SFO, so it felt appropriate to put it here.
On the subject of rain and computer music, I did get to train a tensorflow-wavenet model on my crappy hardware to get a rain-like signal out. It sounds somewhat like rain, but it turns out rain is similar to white noise anyway, so I probably should have started with something that had more tones in it. Hopefully I’ll write a post about using wavenet soon – maybe after I get a 1080, so I won’t need to wait many days for a model to train.