What's a Markov Process?


A Markov analysis looks at a sequence of events, and analyzes the tendency of one event to be followed by another. Using this analysis, you can generate a new sequence of random but related events, which will look similar to the original.

A Markov process is useful for analyzing dependent random events - that is, events whose likelihood depends on what happened last. It would NOT be a good way to model a coin flip, for example, since every time you toss the coin, it has no memory of what happened before. The sequence of heads and tails are not inter-related. They are independent events.

But many random events are affected by what happened before. For example, yesterday's weather does have an influence on what today's weather is. They are not independent events.

A Markov model could look at a long sequence of rainy and sunny days, and analyze the likelihood that one kind of weather gets followed by another kind. Let's say it was found that 25% of the time, a rainy day was followed by a sunny day, and 75% of the time, rain was followed by more rain. Let's say we found out additionally, that sunny days were followed 50% of the time by rain, and 50% by sun. Given this analysis, we could generate a new sequence of statistically similar weather by following these steps:

1) Start with today's weather.
2) Given today's weather, choose a random number to pick tomorrow's weather.
3) Make tomorrow's weather "today's weather" and go back to step 2.

What we'd get is a sequence of days like:
Sunny Sunny Rainy Rainy Rainy Rainy Sunny Rainy Rainy Sunny Sunny...

In other words, the "output chain" would reflect statistically the transition probabilities derived from weather we observed.

This stream of events is called a Markov Chain. A Markov Chain, while similar to the source in the small, is often nonsensical in the large. (Which is why it's a lousy way to predict weather.) That is, the overall shape of the generated material will bear little formal resemblance to the overall shape of the source. But taken a few events at a time, things feel familiar.

Doctor Nerve's Markov program uses English words instead of weather, yielding odd scrambled results that sound like very disturbed English. The program looks at the likelihood that every word you typed in gets followed by another word. After it generates all these transition probabilities, it starts with the first word you typed and starts generating random numbers to pick the sequence of words following it. You will always get pairs of words that occurred in the original. But looking at three or more words at a time starts to look pretty bizarre.

The results are like dreamy crossed memories of the original text whose phrases branch off and meander in simultaneously unexpected and uncomfortably familiar ways.

For example, if you entered: "the boy and the dog went to the park."

The Markov chain might output: "the boy and the boy and the dog went to the boy"

...note that the word "the" is the statistical pivot here - it was followed by "boy", "dog", and "park", so there's a 33.3% chance that one of these words will follow "the" every time the Markov Chain picks "the".

The Markov Chain would NEVER output: "boy the park."

...because "boy" was never followed by "the" in the original story.

Get it?

For more info about Markov Chains, refer to Scientific American June 89, AK Dewdney's Computer Recreations column. Also, the book "Computer Music - Synthesis, Composition, and Performance" by Dodge/Jerse (Schirmer Books) is an excellent source of not only Markov analysis techniques, but other statistical methods applicable to music as well.

Markov processes have been used to generate music as early as the 1950's by Harry F. Olson at Bell Labs. Olson used them to analyse the music of American composer Stephen Foster, and generate scores based on the analyses of 11 of Foster's songs. Lejaren Hiller and Robert Baker also worked with Markov processes to produce their "Computer Cantata" in 1963. You can use Nerve's Markov program to do this by typing in pitches and durations instead of words. For example, C#Q DQ C#Q DS EE. could stand for the following melody: C# quarter note, D quarter note, C# quarter note, D sixteenth note, E dotted eighth note. Use whatever notation you want, just stay consistent - Markov will happily analyze your input and generate statistically similar output.

-- Nick Didkovsky, Feb 16, 1996

Back to Nerve Interactive