US Elections

I thought I would produce a series of wordclouds and sentiment graphs each day, from Australia, using data from Twitter.

The basic search keys are “trump election” and “biden election”. They are very simple and there is some overlap. 5000 tweets are sampled.

The wordcloud is formed by counting the most frequent groups of 5 words.
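As a rough sketch of the counting step, the Python below tallies every run of 5 consecutive words across a list of tweets. It is not the program used for these graphs; the function name `ngram_counts`, the tokenising rule and the two sample tweets are all assumptions made for illustration.

```python
from collections import Counter
import re

def ngram_counts(texts, n=5):
    """Count every run of n consecutive words across all texts."""
    counts = Counter()
    for text in texts:
        # Lower-case and keep only word-like tokens (an assumed, very simple tokeniser).
        words = re.findall(r"[a-z0-9']+", text.lower())
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts

# Made-up tweets, purely for illustration.
tweets = [
    "the election is only two weeks away now",
    "The election is only two weeks from today",
]
counts = ngram_counts(tweets, n=5)
for phrase, count in counts.most_common(3):
    print(count, phrase)
```

The resulting counts are what a wordcloud library would take as phrase weights.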

The sentiment graph is produced using a method that attempts to account for negation, e.g. “not like” is considered negative but “like” is positive.
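The idea of negation handling can be sketched in a few lines of Python. This is only a toy version under stated assumptions: the word lists `LEXICON` and `NEGATORS` are invented here, and the rule (flip a word's polarity when the word immediately before it is a negator) is a simplification, not the method actually used for the graphs.

```python
# Toy word lists, assumed for illustration only.
NEGATORS = {"not", "no", "never"}
LEXICON = {"like": 1, "good": 1, "great": 1, "bad": -1, "hate": -1}

def sentiment_score(text):
    """Sum word polarities, flipping the sign of a word that follows a negator."""
    words = text.lower().split()
    score = 0
    for i, word in enumerate(words):
        polarity = LEXICON.get(word, 0)
        if i > 0 and words[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score

print(sentiment_score("I like this"))         # positive score
print(sentiment_score("I do not like this"))  # negative score
```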

22/10/2020 midday EST Australia

“trump election”


The overall sentiment is negative, i.e. less than zero, on the left-hand side of the graph.

5-word phrases wordcloud

“biden election”


More balanced, with positive and negative sentiment spread around 0.


Comparison of Biden and Trump tweets

The report on this research can be found at:

The two most interesting visualizations are the wordclouds of three-word phrases. Biden’s is first, then Trump’s.

Sentiment analysis showed only minor differences. However, the wordclouds show a very different focus for each. Biden’s is characterised by more positive statements about issues and the need to defeat Trump. Trump’s, on the other hand, shows a strong focus on his slogans, e.g. Corrupt Joe Biden.

Covid-19 US Race Investigation

The covid-19 death rate in the US is shown to have a statistically significant difference between Afro-American (AA) and White (W) populations. The graph below shows the AA proportion difference compared to the W proportion below the x=0 line. There is a clear difference across nearly all states.

The full outline, R program and data file are available at ameyenn/covid19

The PCA biplot confirms the difference. The vectors show a negative relationship between AA and W, and AA is also linked with the poverty vector. NY is the main outlier.

A basic cluster plot across all the data is shown below.

A t-test was performed with the following results, indicating a statistically significant difference.

Welch Two Sample t-test

data: dt[, 1] and dt[, 2]
t = 3.7143, df = 80.722, p-value = 0.0003739
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3026568 1.0010984
sample estimates:
mean of x mean of y
2.688740 2.036862
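For readers who want to see what the Welch test computes, here is a minimal Python sketch of the t statistic and the Welch-Satterthwaite degrees of freedom. The function name `welch_t` and the two sample lists are made up for illustration; the original analysis was done in R on the real data. The last lines use only the figures printed in the output above to check that the confidence interval is centred on the difference of the two means.

```python
import math
import statistics

def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    a = statistics.variance(x) / len(x)  # squared standard error of mean(x)
    b = statistics.variance(y) / len(y)  # squared standard error of mean(y)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (len(x) - 1) + b ** 2 / (len(y) - 1))
    return t, df

# Made-up samples, not the covid-19 data.
t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t, 4), round(df, 3))

# Consistency check on the reported output: the midpoint of the
# 95 percent interval should equal mean of x minus mean of y.
midpoint = (0.3026568 + 1.0010984) / 2
difference = 2.688740 - 2.036862
print(abs(midpoint - difference) < 1e-5)
```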

PISA – Australia and China Maths

The Australian sample is randomly chosen from 720 schools, forming a sample of 14,000 students.
The China sample is not random. It is skewed to the brightest students across 4 provinces nominated by the Chinese Government. This is the exception; all other countries are sampled randomly.

The above shows the trend using the full Y-axis rather than exaggerating the decline by using a restricted axis.

Climate Debate Issues

Climategate – as I understand it, this is about the statistical techniques (the ‘trick’) used to compare data sources and to ‘hide the decline’ in tree ring data from trees in Siberia after 1960, which were not consistent with other worldwide data. There have been plenty of enquiries, all of which cleared the scientists of wrongdoing.

This website gave the best discussion I could find: I can’t vouch for the organization but it appears reputable.

Even if one accepted the skullduggery claims, the temperature data was not, it seems, involved.

With respect to the destruction of data, I found this blog article (the best I could find), which seems at variance with the claims made by skeptics.

My conclusion is that there is not much that I can find to support the view expressed in a range of sites that the scientists were involved in a cover up.

Removal from Journal Editorial Boards or paper rejection

There are discussions of paper refusal – here is one example

I am concerned about the claim that climate science can move into activist territory and be biased. There needs to be evidence: is there hard evidence that this is preventing contrary views from being expressed?

How Climate Skeptic papers can be published. This is an interesting perspective:

A literature review of published climate skeptic papers:

On this brief review, I can’t find any evidence of a hoax. I find disputes, attempts to set the record straight and to explain, possible confusion, accusations etc., all of which are classic in a complex change process.

Misleading Discussion of Climate Graphs

Misleading use of graphs-

US Temperature Data

The graphs below show the data published by NASA in 1999. On the left there seems to be no pattern; on the right the global temperature anomalies show an increasing trend. Anomalies are measured by taking the difference between yearly temperatures and the 30 year average. On the Y-axis, 1.0 means that the yearly average was 1 degree hotter than the 30 year average. In the US the temperatures in the 1930s were very hot over a widespread area; this is the dust bowl period.

The NASA 2017 graph shows the same data but also shows the data from 2000 to 2018 and a statistical trend line. The claim is made that ‘this is significant scientific fraud’. There is no evidence for this, and the trend shown should give rise to concern.
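The anomaly calculation described above is simple arithmetic, sketched below in Python. The function name `anomalies`, the 1951 to 1980 baseline period and all temperature values are assumptions made up for illustration; they are not NASA's figures.

```python
def anomalies(yearly_temps, baseline_years):
    """Subtract the baseline-period mean from every yearly temperature."""
    baseline = sum(yearly_temps[y] for y in baseline_years) / len(baseline_years)
    return {year: temp - baseline for year, temp in yearly_temps.items()}

# Made-up data: a flat 14.0 degree baseline over an assumed 30-year
# period, followed by one warmer year.
temps = {year: 14.0 for year in range(1951, 1981)}
temps[2016] = 15.0

result = anomalies(temps, baseline_years=range(1951, 1981))
print(result[2016])  # anomaly for the warmer year, relative to the baseline mean
```

An anomaly of 1.0 on such a graph therefore means the year was 1 degree warmer than the baseline average, regardless of the absolute temperature.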

Heatwave Index

This graph is used to state that there is no discernible pattern in temperature and that it was hotter in the 30s.

The Y-axis is not easy to interpret, but bear in mind this graph shows HEATWAVE conditions only, defined to be above the 10 year average over four consecutive days. The graph does not enable you to say whether the average temperature is increasing or whether the frequency of hot days is increasing.

The graph is for temperature data across the US; the data shown in the following graphic is restricted to US cities.

The following graph shows that the frequency and length of heat waves are increasing in US cities. (source

The graph below shows the change in hot days and hot nights, shown as a percentage of land area.

How do we interpret the graph? Hot nights are on the increase, and there is a recent dip in daytime temperature coverage of land. The trend line is upwards.
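To make the point concrete, here is a small Python sketch of how a heatwave count of this kind might be computed: it counts runs of at least four consecutive days above a threshold. The function name `count_heatwaves`, the threshold and the daily temperatures are all made up, and the real index may be defined differently in detail. Note that the count says nothing about the average temperature.

```python
def count_heatwaves(daily_temps, threshold, min_days=4):
    """Count runs of at least min_days consecutive days above threshold."""
    heatwaves = 0
    run = 0
    for temp in daily_temps:
        if temp > threshold:
            run += 1
            if run == min_days:  # count each qualifying run exactly once
                heatwaves += 1
        else:
            run = 0
    return heatwaves

# Made-up daily temperatures: one 4-day hot spell and one 3-day hot spell.
days = [30, 36, 37, 38, 39, 31, 36, 36, 36, 30]
n_heatwaves = count_heatwaves(days, threshold=35)
print(n_heatwaves)
```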


Finally, the claim that 30,000 scientists are against climate change dates to a petition circulating since 1998. There are numerous fact checks on this, and I feel that at a minimum such criticism ought to be taken into account before making such claims. Certainly they made the hoax claim.

In conclusion, the data and discussion on the Humansarefree site are misleading and do not attempt a sound analysis.