
Mathematical Model For (E-) Book Sales

It seems to be a no-brainer that with more books on the market, an author will see higher revenues. I wanted to know more about how the sales rate varies with the number of books. So I did what I always do when faced with an economic problem: construct a mathematical model. Even though it took me several tries to find the right approach, I’m fairly confident that the following model is able to explain why revenues grow overproportionally, that is, faster than linearly, with the number of books an author has published. I also stumbled across a way to correct the marketing R/C (the ratio of ad revenues to ad costs) for the number of books.

The basic quantities used are:

  • n = number of books
  • i = impressions per day
  • q = conversion probability (which is the probability that an impression results in a sale)
  • s = sales per buyer
  • r = daily sales rate

Obviously the basic relationship is:

r = i(n) * q(n) * s(n)

with the brackets indicating a dependence of the quantities on the number of books.

1) Let’s start with s(n) = sales per buyer. Suppose there’s a probability p that a buyer who has purchased one of the author’s books will go on to buy yet another book by the same author. To visualize this, think of the books as some kind of mirrors: each ray (sale) will either go through the book (no further sales from this buyer) or be reflected onto another book of the author. In the latter case, the process repeats. Using this “reflective model”, the number of sales per buyer is:

s(n) = 1 + p + p² + … + p^(n-1) = (1 – p^n) / (1 – p)

For example, if the probability of a reader buying another book from the same author is p = 15 % = 0.15 and the author has n = 3 books available, we get:

s(3) = (1 – 0.15³) / (1 – 0.15) = 1.17 sales per buyer

So the number of sales per buyer increases with the number of books. However, it quickly reaches a limiting value. Letting n go to infinity results in:

s(∞) = 1 / (1 – p)

Hence, this effect is a source for overproportional growth only for the first few books. After that it turns into a constant factor.
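To see how quickly the limiting value is reached, the closed form can be evaluated for a few values of n. The following Python lines are a minimal sketch; the function name `sales_per_buyer` is my own choice and not part of the model.

```python
def sales_per_buyer(p: float, n: int) -> float:
    """Expected sales per buyer for n available books and repeat probability p."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    # Closed form of the geometric series with n terms.
    return (1 - p ** n) / (1 - p)


# Repeat-purchase probability from the example above.
p = 0.15
for n in (1, 2, 3, 5, 10):
    print(n, round(sales_per_buyer(p, n), 4))

# Limiting value for a very large number of books.
print("limit", round(1 / (1 - p), 4))
```

For p = 0.15 the value for three books is about 1.17, already very close to the limit of about 1.18.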

2) Let’s turn to q(n) = conversion probability. Why should this quantity depend on the number of books at all? Studies show that the probability of making a sale grows with the choice offered. That’s why ridiculously large malls work. When an author offers a large number of books, he or she is able to provide list impressions (featuring all of his or her books) in addition to the common single impressions (featuring only one book). With more choice, the conversion probability on list impressions will be higher than that on single impressions.

  • qs = single impression conversion probability
  • ps = percentage of impressions that are single impressions
  • ql = list impression conversion probability
  • pl = percentage of impressions that are list impressions

with ps + pl = 1. The overall conversion probability will be:

q(n) = qs(n) * ps(n) + ql(n)* pl(n)

With ql(n) and pl(n) obviously growing with the number of books and ps(n) decreasing accordingly, we get an increase in the overall conversion probability.

3) Finally let’s look at i(n) = impressions per day. Denoting by i_1, i_2, … the number of daily impressions generated by book number 1, book number 2, … , the average number of impressions per day and book is:

ib = 1/n * ∑[k] i_k

with ∑[k] meaning the sum over all k. The overall impressions per day are:

i(n) = ib(n) * n

Assuming all books generate the same number of daily impressions, this is linear growth. However, there might be an overproportional factor at work here. As an author keeps publishing, his or her experience in writing, editing and marketing will grow. Especially for initially inexperienced authors, the quality of the books and the marketing approach will improve with each book. Translated into numbers, this means that later books will generate more impressions per day:

i_(k+1) > i_k

which leads to an overproportional (instead of just linear) growth in overall impressions per day with the number of books. Note that more experience should also translate into a higher single impression conversion probability:

qs(n+1) > qs(n)

4) As a final treat, let’s look at how these effects impact the marketing R/C. The marketing R/C is the ratio of revenues that result from an ad divided by the costs of the ad:

R/C = Revenues / Costs

For an ad to be worthwhile for an author, this value should be greater than 1. Assume an ad generates a total of iad single impressions. For one book we get the revenues:

R = iad * qs(1)

If more than one book is available, this number changes to:

R = iad * qs(n) * (1 – p^n) / (1 – p)

So if the R/C in the case of one book is (R/C)_1, the corrected R/C for a larger number of books is:

R/C = (R/C)_1 * qs(n) / qs(1) * (1 – p^n) / (1 – p)

In short: ads that aren’t profitable can become profitable as the author offers more books.
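The correction is easy to play with numerically. In the following Python sketch the function name and the input values are hypothetical: an ad with a one-book R/C of 0.9, a repeat-purchase probability of p = 0.15 and an unchanged single impression conversion probability (ratio qs(n) / qs(1) = 1).

```python
def corrected_rc(rc_one_book: float, p: float, n: int, qs_ratio: float = 1.0) -> float:
    """Marketing R/C corrected for n books.

    rc_one_book: R/C measured or estimated for a single book
    p:           probability that a buyer goes on to buy another book
    qs_ratio:    qs(n) / qs(1), set to 1 if the conversion probability is unchanged
    """
    return rc_one_book * qs_ratio * (1 - p ** n) / (1 - p)


# Hypothetical ad that loses money with one book on the market.
for n in (1, 2, 3, 5):
    print(n, round(corrected_rc(0.9, 0.15, n), 3))
```

With these made-up inputs the ad is below break-even for one book and above it once a second book is available.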

For more mathematical modeling check out: Mathematics of Blog Traffic: Model and Tips for High Traffic.

Intensity: How Much Power Will Burst Your Eardrums?

Under ideal circumstances, sound or light waves emitted from a point source propagate in a spherical fashion from the source. As the distance to the source grows, the energy of the waves is spread over a larger area and thus the perceived intensity decreases. We’ll take a look at the formula that allows us to compute the intensity at any distance from a source.


First of all, what do we mean by intensity? The intensity I tells us how much energy we receive from the source per second and per square meter. Accordingly, it is measured in the unit J per s and m² or simply W/m². To calculate it properly we need the power of the source P (in W) and the distance r (in m) to it.

I = P / (4 · π · r²)

This is one of these formulas that can quickly get you hooked on physics. It’s simple and extremely useful. In a later section you will meet the denominator again. It is the expression for the surface area of a sphere with radius r.

Before we go to the examples, let’s take a look at a special intensity scale that is often used in acoustics. Instead of expressing the sound intensity in the common physical unit W/m², we convert it to its decibel value dB using this formula:

dB ≈ 120 + 4.34 · ln(I)

with ln being the natural logarithm. For example, a sound intensity of I = 0.00001 W/m² (busy traffic) translates into 70 dB. This conversion is done to avoid dealing with very small or large numbers. Here are some typical values to keep in mind:

0 dB → Threshold of Hearing
20 dB → Whispering
60 dB → Normal Conversation
80 dB → Vacuum Cleaner
110 dB → Front Row at Rock Concert
130 dB → Threshold of Pain
160 dB → Bursting Eardrums
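Both formulas are easy to try out numerically. The following Python lines are a minimal sketch; the function names are my own, the 100 W source at 2 m is a made-up input, and the code uses the exact factor 10 / ln(10) ≈ 4.34 instead of the rounded value.

```python
import math


def intensity(power_w: float, distance_m: float) -> float:
    """Intensity in W/m² at a given distance from an ideal point source."""
    return power_w / (4 * math.pi * distance_m ** 2)


def decibel(intensity_w_m2: float) -> float:
    """Decibel value for a sound intensity given in W/m²."""
    # 10 / ln(10) is about 4.34, the factor in the conversion formula.
    return 120 + (10 / math.log(10)) * math.log(intensity_w_m2)


# Busy traffic, the example value from the text.
print(round(decibel(0.00001)))

# Hypothetical 100 W source heard from 2 m away.
i = intensity(100, 2)
print(round(i, 2), "W/m²,", round(decibel(i)), "dB")
```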

Now on to the examples.

———————-

We just bought a P = 300 W speaker and want to try it out at maximal power. To get the full dose, we sit at a distance of only r = 1 m. Is that a bad idea? To find out, let’s calculate the intensity at this distance and the matching decibel value.

I = 300 W / (4 · π · (1 m)²) ≈ 23.9 W/m²

dB ≈ 120 + 4.34 · ln(23.9) ≈ 134 dB

This is already past the threshold of pain, so yes, it is a bad idea. But on the bright side, there’s no danger of the eardrums bursting. So it shouldn’t be dangerous to your health as long as you’re not exposed to this intensity for a longer period of time.

As a side note: the speaker is of course no point source, so all these values are just estimates founded on the idea that as long as you’re not too close to a source, it can be regarded as a point source in good approximation. The more the source resembles a point source and the farther you’re from it, the better the estimates computed using the formula will be.

———————-

Let’s reverse the situation from the previous example. Again we assume a distance of r = 1 m from the speaker. At what power P would our eardrums burst? Have a guess before reading on.

As we can see from the table, this happens at 160 dB. To be able to use the intensity formula, we need to know the corresponding intensity in the common physical quantity W/m². We can find that out using this equation:

160 ≈ 120 + 4.34 · ln(I)

We’ll subtract 120 from both sides and divide by 4.34:

40 ≈ 4.34 · ln(I)   

9.22 ≈ ln(I)

The inverse of the natural logarithm ln is the exponential function with base e (Euler’s number). In other words: e to the power of ln(I) is just I. So in order to get rid of the natural logarithm in this equation, we’ll raise e to the power of each side:

e^9.22 ≈ e^ln(I)

10,100 ≈ I

Thus, 160 dB corresponds to I = 10,100 W/m². At this intensity eardrums will burst. Now we can answer the question of how much power P will do that, given that we are only r = 1 m from the sound source. We insert the values into the intensity formula and solve for P:

10,100 = P / (4 · π · 1²)

10,100 = 0.08 · P

P ≈ 126,000 W

So don’t worry about ever bursting your eardrums with a speaker or a set of speakers. Not even the powerful sound systems at rock concerts could accomplish this.
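The reverse calculation can also be written as a short Python sketch. The function names are my own, and the inversion uses base-10 logarithms, which is equivalent to solving the decibel formula with the natural logarithm but avoids the rounded factor 4.34.

```python
import math


def intensity_from_decibel(level_db: float) -> float:
    """Intensity in W/m² that corresponds to a given decibel value."""
    return 10 ** ((level_db - 120) / 10)


def required_power(level_db: float, distance_m: float) -> float:
    """Power in W a point source needs to produce level_db at distance_m."""
    return intensity_from_decibel(level_db) * 4 * math.pi * distance_m ** 2


# Power needed to reach 160 dB at a distance of 1 m.
print(round(required_power(160, 1)))
```

Without the intermediate rounding, the result comes out slightly below the 126,000 W found above.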

———————-

This was an excerpt from the ebook “Great Formulas Explained – Physics, Mathematics, Economics”, released yesterday and available here: http://www.amazon.com/dp/B00G807Y00.

Physics (And The Formula That Got Me Hooked)

A long time ago, in my teen years, this was the formula that got me hooked on physics. Why? I can’t say for sure. I guess I was very surprised that you could calculate something like this so easily. So with some nostalgia, I present another great formula from the field of physics. It will be a continuation of and a last section on energy.

To heat something, you need a certain amount of energy E (in J). How much exactly? To compute this we require three inputs: the mass m (in kg) of the object we want to heat, the temperature difference T (in °C) between initial and final state and the so-called specific heat c (in J per kg °C) of the material that is heated. The relationship is quite simple:

E = c · m · T

If you double any of the input quantities, the energy required for heating will double as well. A very helpful addition to problems involving heating is this formula:

E = P · t

with P (in watt = W = J/s) being the power of the device that delivers heat and t (in s) the duration of the heat delivery.

———————

The specific heat of water is c = 4200 J per kg °C. How much energy do you need to heat m = 1 kg of water from room temperature (20 °C) to its boiling point (100 °C)? Note that the temperature difference between initial and final state is T = 80 °C. So we have all the quantities we need.

E = 4200 · 1 · 80 = 336,000 J

Additional question: How long will it take a water heater with an output of 2000 W to accomplish this? Let’s set up an equation for this using the second formula:

336,000 = 2000 · t

t ≈ 168 s ≈ 3 minutes
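The same two steps in a few lines of Python, as a minimal sketch with function names of my own choosing:

```python
def heating_energy(c: float, m: float, delta_t: float) -> float:
    """Energy in J needed to heat mass m (kg) by delta_t (°C), E = c · m · T."""
    return c * m * delta_t


def heating_time(energy_j: float, power_w: float) -> float:
    """Time in s a heater with the given power needs, from E = P · t."""
    return energy_j / power_w


# 1 kg of water from 20 °C to 100 °C, heated with 2000 W.
energy = heating_energy(c=4200, m=1, delta_t=100 - 20)
seconds = heating_time(energy, power_w=2000)
print(energy, "J")
print(seconds, "s =", round(seconds / 60, 1), "min")
```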

———————-

We put m = 1 kg of water (c = 4200 J per kg °C) in one container and m = 1 kg of sand (c = 290 J per kg °C) in another next to it. This will serve as an artificial beach. Using a heater we add 10,000 J of heat to each container. By what temperature will the water and the sand be raised?

Let’s turn to the water. From the given data and the great formula we can set up this equation:

10,000 = 4200 · 1 · T

T ≈ 2.4 °C

So the water temperature will be raised by 2.4 °C. What about the sand? It also receives 10,000 J.

10,000 = 290 · 1 · T

T ≈ 34.5 °C

So sand (or any ground in general) will heat up much more strongly than water. In other words: the temperature of ground reacts quite strongly to changes in energy input while water is rather sluggish. This explains why the climate near oceans is milder than inland, that is, why the summers are less hot and the winters less cold. The water efficiently dampens the changes in temperature.
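The comparison can be repeated for other amounts of heat with a small Python sketch. The function name is my own, and the specific heats are simply the values used in the example.

```python
# Specific heats in J per kg °C, as used in the example above.
SPECIFIC_HEAT = {"water": 4200, "sand": 290}


def temperature_rise(energy_j: float, c: float, mass_kg: float) -> float:
    """Temperature increase in °C, from E = c · m · T solved for T."""
    return energy_j / (c * mass_kg)


# 10,000 J added to 1 kg of each material.
for material, c in SPECIFIC_HEAT.items():
    print(material, round(temperature_rise(10_000, c, 1), 1), "°C")
```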

It also explains the land-sea-breeze phenomenon (seen in the image below). During the day, the sun’s energy will cause the ground to be hotter than the water. The air above the ground rises, leading to cooler air flowing from the ocean to the land. At night, due to the lack of the sun’s power, the situation reverses. The ground cools off quickly and now it’s the air above the water that rises.

Image
———————-

I hope this formula got you hooked as well. It’s simple, useful and can explain quite a lot of physics at the same time. It doesn’t get any better than this. Now it’s time to leave the concept of energy and turn to other topics.

This was an excerpt from my Kindle ebook: Great Formulas Explained – Physics, Mathematics, Economics. For another interesting physics quickie, check out: Intensity (or: How Much Power Will Burst Your Eardrums?).

Physics: Free Fall and Terminal Velocity

After a while of free fall, any object will reach and maintain a terminal velocity. To calculate it, we need a lot of inputs.

The necessary quantities are: the mass of the object m (in kg), the gravitational acceleration g (in m/s²), the density of air D (in kg/m³), the projected area of the object A (in m²) and the drag coefficient c (dimensionless). The latter two quantities need some explaining.

The projected area is the largest cross-section in the direction of fall. You can think of it as the shadow of the object on the ground when the sun’s rays hit the ground at a ninety degree angle. For example, if the falling object is a sphere, the projected area will be a circle with the same radius.

The drag coefficient is a dimensionless number that depends in a very complex way on the geometry of the object. There’s no simple way to compute it; usually it is determined in a wind tunnel. However, you can find the drag coefficients for common shapes in the picture below.

Now that we know all the inputs, let’s look at the formula for the terminal velocity v (in m/s). It is valid for objects dropped from such great heights that they manage to reach this limiting value, which is basically a result of the air resistance canceling out gravity.

v = √(2 * m * g / (c * D * A))

Let’s do an example.

Skydivers are in free fall after leaving the plane, but soon reach the terminal velocity. We will set the mass to m = 75 kg, g = 9.81 m/s² (as usual) and D = 1.2 kg/m³. In a head-first position the skydiver has a drag coefficient of c = 0.8 and a projected area A = 0.3 m². What is the terminal velocity of the skydiver?

v = √(2 * 75 * 9.81 / (0.8 * 1.2 * 0.3))

v ≈ 71 m/s ≈ 260 km/h ≈ 160 mph

Let’s take a look at how changing the inputs varies the terminal velocity. Two bullet points will be sufficient here:

  • If you quadruple the mass (or the gravitational acceleration), the terminal velocity doubles. So a very heavy skydiver or a regular skydiver on a massive planet would fall much faster.
  • If you quadruple the drag coefficient (or the density or the projected area), the terminal velocity halves. This is why parachutes work. They have a higher drag coefficient and larger area, thus effectively reducing the terminal velocity.
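Both the example and the two scaling rules can be checked with a few lines of Python. This is a minimal sketch; the function name and the keyword arguments are my own choice.

```python
import math


def terminal_velocity(m: float, g: float, c: float, D: float, A: float) -> float:
    """Terminal velocity in m/s.

    m: mass in kg, g: gravitational acceleration in m/s²,
    c: drag coefficient, D: air density in kg/m³, A: projected area in m².
    """
    return math.sqrt(2 * m * g / (c * D * A))


# Skydiver in head-first position, values from the example.
v = terminal_velocity(m=75, g=9.81, c=0.8, D=1.2, A=0.3)
print(round(v, 1), "m/s =", round(v * 3.6), "km/h")

# Quadrupling the mass doubles v, quadrupling the area halves it.
print(round(terminal_velocity(m=300, g=9.81, c=0.8, D=1.2, A=0.3) / v, 3))
print(round(terminal_velocity(m=75, g=9.81, c=0.8, D=1.2, A=1.2) / v, 3))
```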

This was an excerpt from the Kindle ebook: Great Formulas Explained – Physics, Mathematics, Economics. Check out my BEST OF for more interesting physics articles.

Missile Accuracy (CEP) – Excerpt from “Statistical Snacks”

An important quantity when comparing missiles is the CEP (Circular Error Probable). It is defined as the radius of the circle in which 50 % of the fired missiles land. The smaller it is, the better the accuracy of the missile. The German V2 rockets, for example, had a CEP of about 17 km. So there was a 50/50 chance of a V2 landing within 17 km of its target. Targeting smaller cities or even complexes was next to impossible with this accuracy; one could only aim for a general area in which the rocket would land rather randomly.

Today’s missiles are significantly more accurate. The latest version of China’s DF-21 has a CEP of about 40 m, allowing the accurate targeting of small complexes or large buildings, while the CEP of the American-made Hellfire is as low as 4 m, enabling precision strikes on small buildings or even tanks.

Assuming the impacts are normally distributed, one can derive a formula for the probability of striking a circular target of radius R using a missile with a given CEP:

p = 1 – exp( -0.69 · R² / CEP² )

The factor 0.69 is ln(2), which makes sure that a target radius of R = CEP yields p = 50 %, as the definition of the CEP demands. This quantity is also called the “single shot kill probability” (SSKP). Let’s include some numerical values. Assume a small complex with the dimensions 100 m by 100 m is targeted with a missile having a CEP of 150 m. Converting the square area into a circle of equal area gives us a radius of about 56 m. Thus the SSKP is:

p = 1 – exp( -0.69 · 56² / 150² ) = 0.092 = 9.2 %

So the chances of hitting the target are relatively low. But the lack of accuracy can be compensated for by firing several missiles in succession. What is the chance of at least one missile hitting the target if ten missiles are fired? First we look at the odds of all missiles missing the target and answer the question from that. One missile misses with a probability of 0.908, so the chance of this event occurring ten times in a row is:

p(all miss) = 0.908^10 = 0.381

Thus the chance of at least one hit is:

p(at least one hit) = 1 – 0.381 = 0.619 = 61.9 %

Still not great considering that a single missile easily costs upwards of 10,000 $. How many missiles of this kind must be fired at the complex to have a 90 % chance of a hit? A 90 % chance of a hit means that the chance of all missiles missing is 10 %. So we can turn the above formula for p(all miss) into an equation by inserting p(all miss) = 0.1 and leaving the number of missiles n undetermined:

0.1 = 0.908^n

All that’s left is doing the algebra. Applying the natural logarithm to both sides and solving for n results in:

n = ln(0.1) / ln(0.908) ≈ 24

So 24 missiles with a CEP of 150 m are required to have a 90 % chance of hitting the complex. As you can verify by doing the appropriate calculations, two DF-21 missiles would be enough to clear the same 90 % mark.
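The calculation boils down to a few small steps, sketched here in Python under some assumptions of mine: the function names are hypothetical, and the coefficient in the exponent is passed in as the parameter k instead of being hard-coded, so it can be set to the value used in the formula above.

```python
import math


def equal_area_radius(width_m: float, length_m: float) -> float:
    """Radius of the circle that has the same area as a rectangular target."""
    return math.sqrt(width_m * length_m / math.pi)


def single_shot_probability(radius_m: float, cep_m: float, k: float) -> float:
    """p = 1 - exp(-k · R² / CEP²), with k the coefficient from the formula."""
    return 1 - math.exp(-k * radius_m ** 2 / cep_m ** 2)


def at_least_one_hit(p_hit: float, n_missiles: int) -> float:
    """Chance that at least one of n independent shots hits the target."""
    return 1 - (1 - p_hit) ** n_missiles


def missiles_needed(p_hit: float, target: float) -> int:
    """Smallest number of shots giving at least the target chance of a hit."""
    return math.ceil(math.log(1 - target) / math.log(1 - p_hit))


# Radius of the circle equivalent to the 100 m by 100 m complex.
print(round(equal_area_radius(100, 100), 1), "m")
```

For a given k, `missiles_needed(single_shot_probability(56, 150, k), 0.9)` carries out the last step of the example, and swapping in a CEP of 40 m does the same for the DF-21.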

Liked the excerpt? Get the book “Statistical Snacks” by Metin Bektas here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2. For more excerpts see The Probability of Becoming a Homicide Victim and How To Use the Expected Value.

My Fair Game – How To Use the Expected Value

You meet a nice man on the street offering you a game of dice. For a wager of just 2 $, you can win 8 $ when the die shows a six. Sounds good? Let’s say you join in and play 30 rounds. What will be your expected balance after that?

You roll a six with the probability p = 1/6. So of the 30 rounds, you can expect to win 1/6 · 30 = 5 rounds, resulting in a pay-out of 40 $. But winning 5 rounds of course also means that you lost the remaining 25 rounds, resulting in a loss of 50 $. Your expected balance after 30 rounds is thus -10 $. Or in other words: for the player this game results in a loss of 1/3 $ per round.

Let’s make a general formula for just this case. We are offered a game which we win with a probability of p. The pay-out in case of victory is P, the wager is W. We play this game for a number of n rounds.

The expected number of wins is p·n, so the expected total pay-out is p·n·P. The expected number of losses is (1-p)·n, so the expected total loss is (1-p)·n·W.

Now we can set up the formula for the balance. We simply subtract the losses from the pay-out. But while we’re at it, let’s divide both sides by n to get the balance per round. It already includes all the information we need and requires one less variable.

B = p · P – (1-p) · W

This is what we can expect to win (or lose) per round. Let’s check it by using the above example. We had the winning chance p = 1/6, the pay-out P = 8 $ and the wager W = 2 $. So from the formula we get this balance per round:

B = 1/6 · 8 $ – 5/6 · 2 $ = – 1/3 $ per round

Just as we expected. Let’s try another example. I’ll offer you a dice game. If you roll two sixes in a row, you get P = 175 $. The wager is W = 5 $. Quite the deal, isn’t it? Let’s see. Rolling two sixes in a row occurs with a probability of p = 1/36. So the expected balance per round is:

B = 1/36 · 175 $ – 35/36 · 5 $ = 0 $ per round

I offered you a truly fair game. No one can be expected to lose in the long run. Of course if we only play a few rounds, somebody will win and somebody will lose.
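Both games can be checked with exact fractions in Python. This is a small sketch with a function name of my own choosing; using `Fraction` avoids any rounding in the result.

```python
from fractions import Fraction


def balance_per_round(p, payout, wager):
    """Expected balance per round, B = p · P – (1-p) · W."""
    return p * payout - (1 - p) * wager


# First game: a six wins 8 $, the wager is 2 $.
print(balance_per_round(Fraction(1, 6), 8, 2))

# Second game: two sixes in a row win 175 $, the wager is 5 $.
print(balance_per_round(Fraction(1, 36), 175, 5))
```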

It’s helpful to understand this balance as being sound for a large number of rounds but rather fragile in case of playing only a few rounds. Casinos are host to thousands of rounds per day and thus can predict their gains quite accurately from the balance per round. After a lot of rounds, all the random streaks and significant one-time events hardly impact the total balance anymore. The real balance will converge to the theoretical balance more and more as the number of rounds grows. This is mathematically proven by the Law of Large Numbers. Assuming finite variance, the proof can be done elegantly using Chebyshev’s Inequality.

The convergence can be easily demonstrated using a computer simulation. We will let the computer, equipped with random numbers, run our dice game for 2000 rounds. After each round the computer calculates the balance per round so far. The picture below shows the difference between the simulated balance per round and our theoretical result of – 1/3 $ per round.

Image
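A simulation along these lines might look as follows in Python. It is a sketch and not necessarily the program that produced the picture: the function name, the fixed seed and the use of Python’s random module are my assumptions.

```python
import random


def simulate_balance_per_round(rounds, payout=8, wager=2, seed=0):
    """Play the dice game and return the running balance per round."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    total = 0
    history = []
    for k in range(1, rounds + 1):
        # A six wins the pay-out, any other roll loses the wager.
        total += payout if rng.randint(1, 6) == 6 else -wager
        history.append(total / k)
    return history


history = simulate_balance_per_round(2000)
theory = -1 / 3

# Difference between simulated and theoretical balance per round.
for k in (10, 100, 1000, 2000):
    print(k, round(history[k - 1] - theory, 3))
```

The exact numbers depend on the seed, but the difference should shrink as the number of rounds grows.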

(Liked the excerpt? Get the book “Statistical Snacks” by Metin Bektas here: http://www.amazon.com/Statistical-Snacks-ebook/dp/B00DWJZ9Z2)