Blogging

Two And A Half Fallacies (Statistics, Probability)

The field of statistics gives rise to a great number of fallacies (and intentional misuse for that matter). One of the most common is the Gambler’s Fallacy. It is the idea that an event can be “due” if it hasn’t appeared against all odds for quite some time.

In August 1913 an almost impossible string of events occurred in a casino in Monte Carlo. The roulette table showed black a record number of twenty-six times in a row. Since the chance for black on a single spin is about 0.474, the probability of this string is 0.474^26, or about 1 in 270 million. For the casino, this was a lucky day. It profited greatly from players believing that once the table showed black several times in a row, the probability for another black to show up was impossibly slim. Red was due.

Unfortunately for the players, this logic failed. The chances for black remained at 0.474, no matter what colors appeared so far. Each spin is a complete reset of the game. The same goes for coins. No matter how many times a coin shows heads, the chance for this event will always stay 0.5. An unlikely string will not alter any probabilities if the events are truly independent.
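Both the streak odds and the independence claim can be checked numerically. The following is only a sketch: the function names, the seed and the number of simulated spins are assumptions made for illustration, and only the single-spin chance of 0.474 is taken from the text.

```python
import random

P_BLACK = 0.474  # single-spin chance for black, as quoted in the text


def streak_probability(p: float, length: int) -> float:
    """Probability of `length` independent hits in a row."""
    return p ** length


def black_rate_after_streak(streak: int, spins: int, seed: int = 1) -> float:
    """Simulate spins and return how often black follows a run of
    at least `streak` blacks (illustrative simulation, not real data)."""
    rng = random.Random(seed)
    run = 0          # consecutive blacks seen just before the current spin
    followups = 0    # spins that came right after a long enough run
    blacks = 0       # how many of those spins were black again
    for _ in range(spins):
        black = rng.random() < P_BLACK
        if run >= streak:
            followups += 1
            blacks += black
        run = run + 1 if black else 0
    return blacks / followups


odds = 1 / streak_probability(P_BLACK, 26)
print(f"26 blacks in a row: 1 in about {odds / 1e6:.0f} million")
print(f"share of black after three blacks: {black_rate_after_streak(3, 200_000):.3f}")
```

If the spins are independent, the simulated share of black after a run of blacks should stay close to 0.474, no matter how long the run was.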

Another common statistical fallacy is “correlation implies causation”. In countries with sound vaccination programmes, cancer rates are significantly elevated, whereas in countries where vaccination hardly takes place, only few people suffer from cancer. This seems to be a clear case against vaccination: it correlates with (and thus surely somehow must cause) cancer.

However, taking a third variable and additional knowledge about cancer into account produces a very different picture. Cancer is a disease of old age. Because it requires a string of undesired mutations to take place, it is usually not found in young people. It is thus clear that in countries with a higher life expectancy, you will find higher cancer rates. This increased life expectancy is reached via the many different tools of health care, vaccination being an important one of them. So vaccination leads to a higher life expectancy, which in turn leads to elevated rates in diseases of old age (among which is cancer). The real story behind the correlation turned out to be quite different from what could be expected at first.

Another interesting correlation was pointed out by the parody religion of the Flying Spaghetti Monster (FSM). Deducing causation here would be madness. Over the 18th and 19th centuries, piracy (the one with the boats, not the one with the files and the sharing) slowly died out. At the same time, possibly as part of a natural trend and/or because of increased industrial activity, the global temperature started increasing. If you plot the number of pirates and the global temperature in a coordinate system, you find a relatively strong correlation between the two. The more pirates there are, the colder the planet is. Here’s the corresponding formula:

T = 16 – 0.05 · P^0.33

with T being the average global temperature (in °C) and P the number of pirates. Given enough pirates (about 33 million, reading the exponent as 1/3), we could even freeze Earth.

[Figure: global temperature plotted against the number of pirates]

But of course nobody in their right mind would see causality at work here. Rather, we have two processes, the disappearance of piracy and global warming, that happened to occur at the same time. So you shouldn’t be too surprised that the recent rise of piracy in Somalia didn’t do anything to stop global warming.

As we saw, a correlation between quantities can arise in many ways and does not always imply causation. Sometimes there is a third, unseen variable in the line of causation, other times it’s two completely independent processes happening at the same time. So be careful when drawing your conclusions.

Though not a fallacy in the strict sense, combinations of low probability and a high number of trials are also a common cause for incorrect conclusions. We computed that in roulette the odds of showing black twenty-six times in a row are only 1 in 270 million. We might conclude that it is basically impossible for this to happen anywhere.

But considering there are something in the order of 3500 casinos worldwide, each playing roughly 100 rounds of roulette per day, we get about 130 million rounds per year. With this large number of trials, it would be foolish not to expect a 1 in 270 million event to occur every now and then. So when faced with a low probability for an event, always take a look at the number of trials. Maybe it’s not as unlikely to happen as suggested by the odds.
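The estimate can be spelled out in a few lines. This sketch treats every round as a possible start of a 26-black streak and uses a Poisson approximation for the chance of at least one occurrence. Both are simplifying assumptions made here, while the casino and round counts are the rough figures from the text.

```python
import math

casinos = 3500          # rough worldwide count quoted in the text
rounds_per_day = 100    # rough rounds of roulette per casino and day
rounds_per_year = casinos * rounds_per_day * 365

p_streak = 0.474 ** 26  # chance of 26 blacks in a row

# Expected number of such streaks per year, counting each round as one trial
expected_per_year = rounds_per_year * p_streak

# Poisson approximation for "at least one streak within a year"
p_at_least_one = 1 - math.exp(-expected_per_year)

print(f"rounds per year: {rounds_per_year:,}")
print(f"expected streaks per year: {expected_per_year:.2f}")
print(f"chance of at least one per year: {p_at_least_one:.0%}")
```

Under these assumptions the expected count comes out at a bit under one half per year, so an event like this turning up somewhere every few years is not surprising at all.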

Five Biggest Mistakes In E-Book Publishing

Are you thinking about publishing an e-book? If so, know that you are entering a highly competitive market. Publishing a book has never been easier and accordingly, many new authors have joined in. To have a chance at being read, you need to make sure to avoid common mistakes.

1. Lack of Writing Experience

Almost everybody can read and write. It’s a skill we learn from an early age on. But writing is not the same as writing well. It takes a lot of practice to write articles or books that make a good read. So before you start that novel, grow a blog and gain experience. This provides a chance to see what works and what doesn’t. And the improvement will become noticeable after just a few weeks or months. As a plus, the blog you grew can serve as a marketing platform once your e-book is finished. In such a competitive market, this can be a big advantage.

2. Writing for Quick Cash

Writing for quick and easy cash is a really bad idea. This might have worked for a short while when e-books were new and fresh, but this time is long gone. Just browse any indie author forum for proof. The market is saturated. If your first e-book brings in $30 a month or so, you can call yourself lucky. If it’s more, even better, but don’t expect it. Writing and selling e-books is not a get-rich-quick scheme. It’s tough work with a very low ROI. If you do it for the money, you’re in for a disappointment. Do it out of passion.

3. Lack of Editing

If you spend three weeks writing a book, expect to spend another three weeks on fine-tuning and proof-reading. To find the mistakes in the text, you have to go over it again and again until you can’t stand your book anymore. You’ll be amazed that seemingly obvious mistakes (the same words twice, for for example) can be overlooked several times. And no spell checker will find that. Tedious editing is just part of writing and if you try to skip that, you will end up with many deserved one-star reviews.

4. No or Ineffective Marketing

With 2.5 million e-books on Amazon, many of high quality, getting noticed is tough. Without any marketing, your sales will most likely decay exponentially over time. The common marketing means for indie authors are: growing a blog, establishing a Facebook fan page, joining Facebook groups and interacting, becoming active on Twitter, joining Goodreads and doing giveaways, free promos via KDP Select, banner and other paid ads (notably on BookBub, which is as expensive as it is effective), and so on. So you’re far from done with just writing, editing and publishing. You should set aside half an hour a day or so for marketing. And always make sure to market to the right people.

5. Stopping After The First Book

Publishing the first e-book can be quite a sobering experience. You just slaved for weeks or even months over your book and your stats hardly move. Was it all worth it? If you did it out of passion, then yes, certainly. But of course you want to be read and so you feel the frustration creeping in. The worst thing you could do is to stop there. Usually sales will pick up after the third or fourth book. So keep publishing and results will come in.

Mathematics of Blog Traffic: Model and Tips for High Traffic

Over the last few days I finally did what I long had planned and worked out a mathematical model for blog traffic. Here are the results. First we’ll take a look at the most general form and then use it to derive a practical, easily applicable formula.

We need some quantities as inputs. The time (in days), starting from the first blog entry, is denoted by t. We number the blog posts with the variable k. So k = 1 refers to the first post published, k = 2 to the second, etc … We’ll refer to the day on which entry k is published by t(k).

The initial number of visits entry k draws from the feed is symbolized by i(k), the average number of views per day entry k draws from search engines by s(k). Assuming that the number of feed views declines exponentially for each article with a factor b (my observations put the value for this at around 0.4 – 0.6), this is the number of views V the blog receives on day t:

V(t) = Σ[k] ( s(k) + i(k) · b^(t – t(k)) )

Σ[k] means that we sum over all k. This is the most general form. For it to be of any practical use, we need to make simplifying assumptions. We assume that the entries are published at a constant frequency f (entries per day) and that each article has the same popularity, that is:

i(k) = i = const.
s(k) = s = const.

After a long calculation you can arrive at this formula. It provides the expected number of daily views given that the above assumptions hold true and that the blog consists of n entries in total:

V = s · n + i / ( 1 – b^(1/f) )

Note that according to this formula, blog traffic increases linearly with the number of entries published. Let’s apply the formula. Assume we publish articles at a frequency f = 1 per day and they draw i = 5 views on the first day from the feed and s = 0.1 views per day from search engines. With b = 0.5, this leads to:

V = 0.1 · n + 10

So once we gathered n = 20 entries with this setup, we can expect V = 12 views per day, at n = 40 entries this grows to V = 14 views per day, etc … The theoretical growth of this blog with number of entries is shown below:

[Figure: expected daily views versus number of entries]
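The example numbers can be reproduced with a short script. It is a sketch under the simplifying assumptions above (constant i, s and f). The function names and the convention that the first entry appears on day 0 are choices made here and not part of the original derivation. The general sum is evaluated on the day the n-th entry is published.

```python
def views_general(t, publish_days, i, s, b):
    """General model: every entry published by day t contributes s search
    views plus i * b**(t - t(k)) feed views."""
    return sum(s + i * b ** (t - tk) for tk in publish_days if tk <= t)


def views_closed_form(n, i, s, b, f):
    """Simplified formula V = s*n + i / (1 - b**(1/f))."""
    return s * n + i / (1 - b ** (1 / f))


i, s, b, f = 5, 0.1, 0.5, 1  # example values from the text

for n in (20, 40):
    days = [k / f for k in range(n)]   # assumed: entry k+1 goes live on day k/f
    t = days[-1]                       # day on which the n-th entry is published
    general = views_general(t, days, i, s, b)
    closed = views_closed_form(n, i, s, b, f)
    print(f"n = {n}: general sum {general:.2f}, closed form {closed:.2f}")
```

For a few dozen entries the two agree almost exactly, because the feed term of the general sum settles very quickly at i / (1 – b^(1/f)).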

How does the frequency at which entries are being published affect the number of views? You can see this dependency in the graph below (I set n = 40):

[Figure: expected daily views versus publishing frequency, n = 40]

The formula is very clear about what to do for higher traffic: get more attention in the feed (good titles, good tagging and a large number of followers all lead to high i and possibly reduced b), optimize the entries for search engines (high s), publish at high frequency (obviously high f) and do this for a long time (high n).

We’ll draw two more conclusions. As you can see the formula neatly separates the search engine traffic (left term) and feed traffic (right term). And while the feed traffic reaches a constant level after a while of constant publishing, it is the search engine traffic that keeps on growing. At a critical number of entries N, the search engine traffic will overtake the feed traffic:

N = i / ( s · ( 1 – b^(1/f) ) )

In the above blog setup, this happens at N = 100 entries. At this point both the search engines as well as the feed will provide 10 views per day.
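As a quick check of the crossover point, here is a minimal sketch using the example values from above. The function name is made up for this illustration.

```python
def crossover_entries(i, s, b, f):
    """Number of entries N at which search engine traffic s*N equals the
    steady feed traffic i / (1 - b**(1/f))."""
    return i / (s * (1 - b ** (1 / f)))


i, s, b, f = 5, 0.1, 0.5, 1  # example values from the text

N = crossover_entries(i, s, b, f)
feed_level = i / (1 - b ** (1 / f))   # constant feed traffic
search_level = s * N                  # search traffic at N entries

print(f"N = {N:.0f} entries")
print(f"feed: {feed_level:.1f} views/day, search: {search_level:.1f} views/day")
```

Doubling s halves N, so better search engine optimization moves the crossover to an earlier point in the blog's life.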

Here’s one more conclusion: the daily increase in the average number of views is just the product of the daily search engine views per entry s and the publishing frequency f:

ΔV / Δt = s · f

Thus, our example blog will experience an increase of 0.1 · 1 = 0.1 views per day or 1 additional view per 10 days. If we publish entries at twice the frequency, the blog would grow by 0.1 · 2 = 0.2 views per day or 1 additional view every 5 days.
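The growth rate is simple enough to tabulate for a few publishing frequencies. This is a small sketch with made-up helper names; s = 0.1 is the example value from the text.

```python
def daily_growth(s, f):
    """Increase in average daily views per day: s * f."""
    return s * f


def days_per_extra_view(s, f):
    """How many days it takes to gain one additional daily view."""
    return 1 / daily_growth(s, f)


s = 0.1  # search engine views per entry and day, as in the example
for f in (0.5, 1, 2):
    print(f"f = {f}: +{daily_growth(s, f):.2f} views/day, "
          f"one extra view every {days_per_extra_view(s, f):.0f} days")
```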

Analysis: Size and Loading Times of WordPress.com Blogs

In the fast-paced online world people are not as patient as in real life. Accordingly, a large home page and a long loading time can negatively affect your blog traffic. Studies have shown that the greater the loading time, the higher the bounce rate. To find out how well my blog performs in this respect (feel free to use the results for your own benefit as well), I did an analysis of 70 WordPress.com blogs. I used iWEBTOOLS’s Website Speed Test and OriginPro for that. With the tool you can analyze ten webpages at once, but note that after ten queries you have to wait a full day (not an hour as the website claims) to do more analyses.

The average size of a WordPress.com blog according to the analysis is 65.3 KB with a standard error SE = 3.0 KB. Here’s how the size is distributed:

[Figure: distribution of home page sizes]

The average loading time at my internet speed (circa 600 KB/s) is 0.66 s with the standard error SE = 0.10 s. Here’s the corresponding distribution:

[Figure: distribution of loading times]

Note that the graph obviously depends on your internet speed. If you have faster internet, the whole distribution will shift to the left. My blog has a home page size of 81.6 KB. From the first graph I can deduce that only about 24 % of home pages are larger in size. My loading time is 0.86 s, and only about 22 % of the blogs top that. So it looks like I really have to throw off some weight.

Here’s the loading time plotted against the home page size:

[Figure: loading time versus home page size]

In a very rough approximation we have the relation:

loading time (in s) = 0.009 · size (in KB)

In other words: getting rid of 10 KB should lower the loading time by about 0.1 seconds. Now feel free to check your own blog and see where it fits in. If you have the time, post your results (if possible including URL, size, loading time, internet speed) in the comments. I’d greatly appreciate the additional data. For a reliable result regarding loading time it’s best to check the same page three times and take the average.
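If you want to play with the rough relation yourself, here is a small sketch. The slope of 0.009 s per KB is the approximate value from above and depends on the internet speed. The three measurements are invented placeholder values that only show the averaging step, and the function names are assumptions.

```python
from statistics import mean

SECONDS_PER_KB = 0.009  # very rough slope from the fit above


def estimated_loading_time(size_kb):
    """Rough loading time in seconds for a home page of the given size."""
    return SECONDS_PER_KB * size_kb


def average_loading_time(measurements_s):
    """Average several measurements of the same page, in seconds."""
    return mean(measurements_s)


print(f"saving from removing 10 KB: {estimated_loading_time(10):.2f} s")

runs = [0.84, 0.88, 0.86]  # hypothetical repeated measurements
print(f"average of three runs: {average_loading_time(runs):.2f} s")
```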

Increase Views per Visit by Linking Within your Blog

One of the most basic and useful performance indicators for blogs is the average number of views per visit. If it is high, that means visitors stick around to explore the blog after reading a post. They value the blog for being well-written and informative. But in the fast-paced, content-saturated online world, achieving a lot of views per visit is not easy.

You can help out a little by making exploring your blog easier for readers. A good way to do this is to link within your blog, that is, to provide internal links. Keep in mind though that random links won’t help much. If you link one of your blog posts to another, they should be connected in a meaningful way, for example by covering the same topic or giving relevant additional information to what a visitor just read.

Being mathematically curious, I wanted to find a way to judge what impact such internal links have on the overall views per visit. Assume you start with no internal links and observe a current number of views per visit of x. Now you add n internal links to your blog, which has m entries in total. Given that the probability for a visitor to make use of an internal link is p, what will the overall number of views per visit change to? Last night I derived a formula for that:

x’ = x + (n / m) · (1 / (1-p) – 1)

For example, my blog (which has as of now very few internal links) has an average of x = 2.3 views per visit and m = 42 entries. If I were to add n = 30 internal links and assuming a reader makes use of an internal link with the probability p = 20 % = 0.2, this should theoretically change into:

x’ = 2.3 + (30 / 42) · (1 / 0.8 – 1) = 2.5 views per visit

A solid 8 % increase in views per visit (from 2.3 to about 2.48 before rounding) and this just by providing visitors a simple way to explore. So make sure to go over your blog and connect articles that are relevant to each other. The higher the relevancy of the links, the higher the probability that readers will end up using them. For example, if I only added n = 10 internal links instead of thirty, but had them at such a level of relevancy that the probability of them being used increases to p = 40 % = 0.4, I would end up with roughly the same overall views per visit:

x’ = 2.3 + (10 / 42) · (1 / 0.6 – 1) = 2.5 views per visit

So it’s about relevancy as much as it is about amount. And in the spirit of not spamming, I’d prefer adding a few high-relevancy internal links rather than a lot of low-relevancy ones.
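To try out other combinations of n and p, the formula can be wrapped in a small function. This is just a sketch: the function name is made up, and the inputs are the example values from this post.

```python
def views_per_visit(x, n, m, p):
    """New views per visit after adding n internal links to a blog with
    m entries, where a reader follows a link with probability p:
    x' = x + (n / m) * (1 / (1 - p) - 1)."""
    return x + (n / m) * (1 / (1 - p) - 1)


x, m = 2.3, 42  # current views per visit and number of entries

many_links = views_per_visit(x, n=30, m=m, p=0.2)
few_relevant_links = views_per_visit(x, n=10, m=m, p=0.4)

print(f"30 links, p = 0.2: {many_links:.2f} views per visit")
print(f"10 links, p = 0.4: {few_relevant_links:.2f} views per visit")
print(f"relative increase with 30 links: {(many_links - x) / x:.1%}")
```

Both scenarios land at about 2.5 views per visit after rounding, which is the point of the comparison: a few relevant links can do the work of many weak ones.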

If you’d like to know more on how to optimize your blog, check out: Setting the Order for your WordPress Blog Posts and Keywords: How To Use Them Properly On a Website or Blog.

Setting the Order for your WordPress Blog Posts

Usually your blog entries are ordered according to the date on which they were published. You can however also order them according to your wishes by altering the time tag. It’s a good idea to bump posts which have proven to be popular among readers to the top of the page from time to time. Note that the posts will not appear as newly published in the feed, but they will be the first thing a reader sees when he or she clicks on your blog’s title.

Here’s how to bump a post to the top of the page:

1. Go to the most current blog entry and click on edit. Under the section “Publish” (at the top, on the right) you will find the time tag. Note this time.

2. Go to the post to be bumped and click on edit. Again look for the time tag under the section “Publish” and edit it. For the post to appear at the top, the time tag must show a later time than that of the current number one.

I sometimes put this post (Mach Cone) at the top of the page because readers seem to enjoy the picture in it (and because I love it as well, it is physics and math come to life and in action).

While we’re at it, check out my sidebar for the “Posts I like” widget. Add it to your blog as well if you’d like to help out bloggers who have created great content.

Article Marketing Myths And Facts

By now everyone has heard of article marketing, but so many people out there define it in so many different ways that it has become hard for newcomers to understand.

In general, article marketing is where you write an article on a topic that is related to your website topic. Not a promotional article for your website, but an article about something that is informative to the reader. In the article you use keywords and phrases that relate to your topic as well, much like you would optimize a webpage. Your article when reprinted will be the text of a webpage or webpages.

In the author bio section at the bottom is some info about you and links to your website. It is suggested that you put in one link to your main page and one to an interior page that fits the article you are writing.

If your article is submitted to websites that take article submissions and offer free content to webmasters, and webmasters then choose to repost your article on their websites, the links in the author bio section become links from their websites to your website.

Now let’s go on to the myths and facts about article marketing.

MYTH: Article marketing doesn’t really help you all that much.

FACT: Article Marketing can help you increase your link popularity and be a source of some of the most targeted traffic you can get.

MYTH: Reprinted Articles only get indexed as supplemental pages, therefore it doesn’t help enough to make it worthwhile.

FACT: Depending on where the article gets submitted to, the article itself can get a top 10 listing in major search engines and not as a supplemental page.

MYTH: Submitting your article everywhere creates duplicate content and the search engines will punish or discount those pages as a result.

FACT: If search engines punished duplicate content in the way that myth suggests, then every blog post reproduced through an RSS feed would be discounted or punished, and they are not. New York Times articles and CNN stories are spread all over the web and are not punished or discounted.

Duplicate content is two webpages that are around 70% similar, not two webpages that have similar text on them.

MYTH: The only way article marketing works is you write an article then submit it to thousands of article submission websites.

FACT: There is more than one way to make article marketing work for you. The way mentioned above works okay if you are looking to get a lot of links back to your website whether they are related or not and can be effective if you currently have very little or no link popularity at all.

Another way is to hand submit your article to article submission websites that only accept articles related to your topic. This is more difficult, but the links from the submissions alone help you more, and it’s more likely that the websites that pick up and repost your article will also be related to your topic, which can help you with better links and targeted traffic.

Yet another way is to write a very high quality article that you really take your time on and research. You then choose a very high traffic website related to your topic. One that has great PR and a lot of visitors.

Email them your article and offer them an exclusive if they will print your article with your links included in the bio. If your article is of good quality and they get an exclusive you have a good chance they will post your article there.

This one posting of your article can be more powerful than the mass submitted article method if you choose the website you submit it to carefully.

Last but not least, posting your article exclusively on your own website is a great way to add fresh content and if the article is good, people will link directly to the article increasing both traffic and PR for your webpage where you posted the article. But for this to work you need to already have some traffic to work with.

MYTH: You should always post your article in your website first, then wait to get crawled by the search engines before submitting the article elsewhere.

FACT: Adding articles to your own website is called adding content. Submitting those articles to other websites is called article marketing. With article marketing you don’t want the article indexed on your website first.

Yes, you read that right. You do not want the article indexed on your website first. You are, or should be, already doing SEO on your website and adding fresh content to it so that the search engines send you traffic.

Submitting articles to other websites and having the search engines find it there first gives another gateway that people can find your website through.

If the websites that you submitted your articles to get crawled often, then having your article appear there with the links intact will get your website crawled as well.

If the websites you submitted your article to are getting indexed well by the search engines, then your article being found on their website first might get it in the top 10 results.

Placing it into your own website with no or low PR might not have gotten the article indexed at all.

I hope this article will clear up some of the myths about article marketing and that it has helped you understand how and why it works.