Zhen Zhang's Blog

R map part 2 -- ggplot

ggplot can draw the map in a more polished way. However, the data that ggplot works with is not our original data: it first transforms the data implicitly using the function fortify().

Basics

Now we use ggplot to draw the map:

ggplot(china_map1, aes(x = long, y = lat, group = group)) +
geom_path(color = "grey40") +
geom_polygon(fill = 'beige')

chinaMap

Let me explain these commands one by one.

The first line calls ggplot() with the data and the aesthetic mappings. Here we pass the original dataset china_map1, but ggplot() automatically transforms it to fortify(china_map1), printing the message “Regions defined for each Polygons”. That is why we cannot see the long and lat variables in china_map2 = china_map1@data: they are in the dataset china_map3 = fortify(china_map1).
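To make the implicit fortify() step less mysterious, here is a minimal base-R sketch of the idea: a nested list of polygons is flattened into one long data frame with one row per vertex. The toy polygons and the helper flatten_polygons() are hypothetical; the real fortify() method works on sp objects and returns more columns (such as hole, piece and group).

```r
# Toy "map": two polygons, each a matrix of (long, lat) vertices.
# These coordinates are made up for illustration.
polys <- list(
  "0" = cbind(long = c(0, 1, 1, 0, 0), lat = c(0, 0, 1, 1, 0)),
  "1" = cbind(long = c(2, 3, 2.5, 2),  lat = c(0, 0, 1, 0))
)

# Hypothetical helper: one row per vertex, keeping the drawing order
# and the id of the polygon each vertex belongs to.
flatten_polygons <- function(polys) {
  pieces <- lapply(names(polys), function(id) {
    m <- polys[[id]]
    data.frame(long = m[, "long"], lat = m[, "lat"],
               order = seq_len(nrow(m)), id = id,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

flat <- flatten_polygons(polys)
head(flat)
```

The attribute table (china_map1@data in this post) has one row per polygon, while a flattened table like this has one row per vertex, which is why long and lat only show up after fortifying.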

In the second line, geom_path() is used instead of geom_line(), since we need to connect the points in the order they appear in the data frame, not in the order of their x values. The color option sets the color of the border.

In the third line, geom_polygon() just sets the fill color of each polygon.

But the map looks a little weird: it does not look like the maps we see every day. This can be fixed by adding coord_map(), which applies a map projection.
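The distortion comes from plotting longitude and latitude as if they were equally scaled Cartesian axes. As a rough illustration (not what coord_map() computes internally, since it applies a full projection), the sketch below estimates how much a degree of latitude should be stretched relative to a degree of longitude at a given latitude. The helper lonlat_aspect() and the latitude of 35 degrees, taken as a rough mid-latitude for China, are assumptions.

```r
# Hypothetical helper: at latitude `lat_deg`, one degree of longitude spans
# cos(lat) times the ground distance of one degree of latitude, so the y axis
# needs to be stretched by 1 / cos(lat) for shapes to look right.
lonlat_aspect <- function(lat_deg) {
  1 / cos(lat_deg * pi / 180)
}

ratio <- lonlat_aspect(35)  # 35 degrees north is an assumed mid-latitude
round(ratio, 2)
```

A ratio like this could be passed to coord_fixed(ratio = ratio) as a cheap alternative; coord_quickmap() applies a similar approximation automatically.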

China’s map

Province Map in different colors

First, I create one big dataset that combines all the information in china_map2 and china_map3.

china_map2$NAME <- iconv(china_map2$NAME, from = "GBK") ## convert the names from GBK to the native encoding (UTF-8 here)
china_map2x <- data.frame(china_map2, id = seq_len(nrow(china_map2)) - 1) ## ids 0, 1, ..., 924, used as the join key
library(plyr)
china_map4 <- join(china_map3, china_map2x, by = "id", type = "full")
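On a small scale, the join works like this. The sketch uses base R's merge() rather than plyr's join() so that it runs without extra packages, and the two toy tables (coords and attrs) are made up.

```r
# Toy stand-in for fortify() output: one row per vertex, keyed by polygon id.
coords <- data.frame(
  long = c(121.5, 121.6, 116.3, 116.4),
  lat  = c(53.3, 53.4, 39.9, 40.0),
  id   = c(0, 0, 1, 1)
)

# Toy stand-in for the attribute table: one row per polygon.
attrs <- data.frame(
  id   = c(0, 1),
  NAME = c("Heilongjiang", "Beijing"),
  stringsAsFactors = FALSE
)

# all = TRUE keeps unmatched rows from both sides, like type = "full" in join().
joined <- merge(coords, attrs, by = "id", all = TRUE)
joined
```

Every vertex row picks up the attributes of the polygon it belongs to, which is what later allows aesthetics such as fill = NAME.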

Now we can have a preview of china_map4:

head(china_map4)
##       long      lat order  hole piece group id   AREA PERIMETER BOU2_4M_ BOU2_4M_ID ADCODE93 ADCODE99     NAME
## 1 121.4884 53.33265     1 FALSE     1   0.1  0 54.447    68.489        2         23   230000   230000 黑龙江省
## 2 121.4995 53.33601     2 FALSE     1   0.1  0 54.447    68.489        2         23   230000   230000 黑龙江省
## 3 121.5184 53.33919     3 FALSE     1   0.1  0 54.447    68.489        2         23   230000   230000 黑龙江省
## 4 121.5391 53.34172     4 FALSE     1   0.1  0 54.447    68.489        2         23   230000   230000 黑龙江省
## 5 121.5738 53.34818     5 FALSE     1   0.1  0 54.447    68.489        2         23   230000   230000 黑龙江省
## 6 121.5840 53.34964     6 FALSE     1   0.1  0 54.447    68.489        2         23   230000   230000 黑龙江省

Now we fill each province with a different color:

ggplot(china_map4, aes(x = long, y = lat, group = group, fill = NAME)) +
  geom_path(color = 'grey40') +
  geom_polygon() +
  scale_fill_manual(values = rainbow(33), guide = "none") + ## one color per province, legend hidden
  coord_map()

China Map

Province Map in different colors with population

In the previous map, the fill was mapped to the province name, which gives 33 distinct colors. Now I am going to fill the regions according to population. The population data come from China’s sixth national census (2010).

NAME <- c("北京市", "天津市", "河北省", "山西省", "内蒙古自治区", "辽宁省", "吉林省", "黑龙江省", "上海市", "江苏省", "浙江省", "安徽省", "福建省", "江西省", "山东省", "河南省", "湖北省", "湖南省", "广东省", "广西壮族自治区", "海南省", "重庆市", "四川省", "贵州省", "云南省", "西藏自治区", "陕西省", "甘肃省", "青海省", "宁夏回族自治区", "新疆维吾尔自治区", "台湾省", "香港特别行政区")
pop <- c(7355291, 3963604, 20813492, 10654162, 8470472, 15334912, 9162183, 13192935, 8893483, 25635291, 20060115, 19322432, 11971873, 11847841, 30794664, 26404973, 17253385, 19029894, 32222752, 13467663, 2451819, 10272559, 26383458, 10745630, 12695396, 689521, 11084516, 7113833, 1586635, 1945064, 6902850, 23193638, 7026400)
pop <- data.frame(NAME, pop, stringsAsFactors = FALSE) ## one row per province
china_map_pop <- join(china_map4, pop, by = "NAME", type = "full")
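Because a full join keeps rows that fail to match, a misspelled or differently encoded province name would silently produce rows with missing coordinates or missing population. A quick check with setdiff() can catch this before plotting. The vectors below are toy data, not the real names.

```r
# Toy example: names found in the map data vs. names in the population table.
map_names <- c("Beijing", "Shanghai", "Zhejiang")
pop_names <- c("Beijing", "Shanghai", "Zhejiang Province")  # deliberate mismatch

# Names present on one side only; both results should be empty for a clean join.
only_in_map <- setdiff(map_names, pop_names)
only_in_pop <- setdiff(pop_names, map_names)

only_in_map
only_in_pop
```

With the real data, the same check would compare unique(china_map4$NAME) with pop$NAME.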

With the data prepared, we can now draw the map:

ggplot(china_map_pop, aes(x = long, y = lat, group = group, fill = pop)) +
geom_polygon() +
geom_path(color = "grey40") +
coord_map()

china_pop

Map of a single province from the country-level data

Here we use the province of Zhejiang as an example.

zhejiang <- subset(china_map4, NAME == "浙江省") ## extract Zhejiang from the data
ggplot(zhejiang, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "beige") +
  geom_path(color = "grey40") +
  annotate("point", x = 120.12, y = 30.16) + ## mark the city with a single point
  annotate("text", x = 120.3, y = 30, label = "杭州市", family = "STKaiti") + ## label: Hangzhou
  coord_map()

Zhejiang

China’s Province Map

It is time to go deeper. In this part, I will work with map data at the province level.

One place, one polygon

First we look at the simple case, where each place has only one polygon attached. The place we choose is Beijing.

china_region <- readShapePoly("mapsData/maps/bou4/BOUNT_poly.shp") ## read the province level data
china_region <- transform(china_region, NAME99 = iconv(NAME99, from = "GBK")) ## Transform from GBK to UTF-8
china_region$ADCODE99[grep("北京", china_region$NAME99)]
## [1] 110100 110100 110100
## 2368 Levels: 0 110100 110112 110113 110221 110224 110226 110227 110228 110229 120100 120221 120222 120223 120224 ... 820000

To read this output, we need to know how ADCODE99 is structured. The first two digits identify the province; here 11 is Beijing. The next two digits identify the city-level part, here 01 and 02. The last two digits identify the district.
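The three parts of a code can be pulled apart with substr(). This is a small sketch on a few codes copied from the output above; the helper name split_adcode() is hypothetical.

```r
# Hypothetical helper: split six-digit ADCODE99 values into their three parts.
split_adcode <- function(code) {
  code <- as.character(code)  # also works if the codes are stored as a factor
  data.frame(
    code     = code,
    province = substr(code, 1, 2),
    city     = substr(code, 3, 4),
    district = substr(code, 5, 6),
    stringsAsFactors = FALSE
  )
}

parts <- split_adcode(c("110100", "110112", "110221"))
parts

# Selecting everything in Beijing then amounts to testing the province part.
parts$province == "11"
```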

So we take the first two digits of each code and keep the rows where they equal 11:

beijing_original <- china_region[substr(as.character(china_region$ADCODE99), 1, 2) == "11",] ## extract the Beijing rows
beijing <- fortify(beijing_original, region = "NAME99") ## use fortify to transform the data
beijing <- transform(beijing, id = iconv(id, from = 'GBK'), group = iconv(group, from = 'GBK')) ## convert encoding
names(beijing)[1:2] <- c("x", "y") ## rename long/lat: geom_map() expects the coordinate columns to be called x and y

Now we generate a random number for each district so that we have some values to plot.

beijing_data <- data.frame(id = unique(sort(beijing$id)))
beijing_data$rand = runif(length(beijing_data$id))
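Since runif() draws new values on every run, the colors of the map will change each time. If a reproducible figure is wanted, the seed can be fixed first. A minimal sketch; the seed value 42 and the district labels are arbitrary placeholders.

```r
# Fixing the seed makes the "random" fill values repeatable.
ids <- c("district A", "district B", "district C")  # placeholder ids

set.seed(42)  # arbitrary seed
first_run <- data.frame(id = ids, rand = runif(length(ids)))

set.seed(42)  # same seed, same values
second_run <- data.frame(id = ids, rand = runif(length(ids)))

identical(first_run$rand, second_run$rand)
```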

Then we can draw the map:

beijing_map = ggplot(beijing_data) +
geom_map(aes(map_id = id, fill = rand), color = "white", map = beijing) +
scale_fill_gradient(high = "darkgreen",low = "lightgreen") +
expand_limits(beijing) +
coord_map()
print(beijing_map)

beijing_map

The sp package provides coordinates(), which returns a center point (longitude and latitude) for each polygon of a map:

beijing_center <- coordinates(beijing_original)
beijing_center
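For intuition about what such a center point is, the sketch below computes the area-weighted centroid of a simple polygon with the shoelace formula. This is an illustration only and not necessarily how sp derives the points returned by coordinates(); the helper polygon_centroid() and the square are made up.

```r
# Hypothetical helper: centroid of a closed, non-self-intersecting polygon
# whose first vertex is repeated as the last one.
polygon_centroid <- function(x, y) {
  n <- length(x)
  i <- seq_len(n - 1)
  cross <- x[i] * y[i + 1] - x[i + 1] * y[i]  # shoelace terms
  area <- sum(cross) / 2                       # signed area
  c(x = sum((x[i] + x[i + 1]) * cross) / (6 * area),
    y = sum((y[i] + y[i + 1]) * cross) / (6 * area))
}

# A one-degree square with its lower-left corner at (116, 39);
# its center is (116.5, 39.5).
center <- polygon_centroid(x = c(116, 117, 117, 116, 116),
                           y = c(39, 39, 40, 40, 39))
center
```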

Then we can add text annotations to the previous Beijing plot:

beijing_center <- as.data.frame(beijing_center)
beijing_center$names <- iconv(beijing_original$NAME99, from = 'GBK') ## add the label text for each center
beijing_center
## The original dataset contains three polygons that are all named '北京市市辖区'
## (Beijing municipal districts), so I give them three different labels
beijing_center$names[c(7,8,10)] <- c("北京市市辖区1","北京市市辖区2","北京市市辖区3")
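Hard-coding the positions 7, 8 and 10 works for this dataset, but duplicated labels can also be made distinct programmatically. A sketch using base R's make.unique(), with placeholder names; note that it produces suffixes such as .1 and .2 rather than the labels chosen above.

```r
# Placeholder labels with one name repeated three times.
labels <- c("urban districts", "Changping", "urban districts", "urban districts")

# make.unique() leaves the first occurrence alone and appends .1, .2, ...
# to later duplicates.
unique_labels <- make.unique(labels)
unique_labels
```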

Here is the plot:

beijing_map + geom_text(aes(x = V1,y = V2,label = names), family = "STKaiti", data = beijing_center)

beijing_map_text

One place, multiple polygons

Here we take Shanghai as the example. The data preprocessing is the same as before:

## First extract the data
shanghai_original = china_region[substr(as.character(china_region$ADCODE99), 1, 2) == '31',]
shanghai = fortify(shanghai_original, region = 'NAME99')
shanghai = transform(shanghai, id = iconv(id, from = 'GBK'), group = iconv(group, from = 'GBK'))
names(shanghai)[c(1, 2, 6, 7)] = c("x", "y", "id", "code") ## rename long/lat to x/y, and use the per-piece group as id so that each polygon gets its own key
head(shanghai)
## Next we generate random data
shanghai_data <- data.frame(id = unique(sort(shanghai$id)))
shanghai_data$rand = runif(length(shanghai_data$id))
shanghai_data
## id rand
## 1 上海市市辖区.1 0.72448945
## 2 上海市市辖区.2 0.59063385
## 3 上海市市辖区.3 0.84606343
## 4 上海市市辖区.4 0.86794623
## 5 上海市市辖区.5 0.56679646
## 6 南汇县.1 0.21214315
## 7 嘉定区.1 0.72355893
## 8 奉贤县.1 0.37875115
## 9 崇明县.1 0.96325270
## 10 崇明县.2 0.38970185
## 11 崇明县.3 0.84432254
## 12 崇明县.4 0.31187179
## 13 崇明县.5 0.20808703
## 14 崇明县.6 0.02448647
## 15 松江区.1 0.08611160
## 16 金山区.1 0.21856591
## 17 金山区.2 0.79596114
## 18 闵行区.1 0.59523458
## 19 青浦县.1 0.53470385
ggplot(shanghai_data) +
  geom_map(aes(map_id = id, fill = rand), map = shanghai) +
  expand_limits(shanghai) +
  coord_map()
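The ids in the output above, such as 崇明县.1 to 崇明县.6, combine a region name with a piece number, so every polygon of a multi-part region gets its own key. Below is a small sketch of how such keys can be built. The toy data frame is made up, and the use of interaction() is an assumption about the mechanism, based only on the shape of the ids.

```r
# Toy vertex table: "Chongming" consists of two pieces, "Songjiang" of one.
pieces <- data.frame(
  id    = c("Chongming", "Chongming", "Chongming", "Songjiang"),
  piece = c(1, 1, 2, 1),
  stringsAsFactors = FALSE
)

# One key per (region, piece) pair, written as "name.piece".
pieces$group <- as.character(interaction(pieces$id, pieces$piece))
unique(pieces$group)
```

Keying the data on these per-piece values, as shanghai_data does, is what lets each island or exclave be filled separately.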

shanghai_map

In the next post, I will talk about another powerful package: ggmap.

References

  1. Douban R by shilululululu
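  2. 用R软件绘制中国分省市地图 (Drawing province- and city-level maps of China with R) by 邱怡轩
  3. 用ggmap包进行地震数据的可视化 (Visualizing earthquake data with the ggmap package) by 肖凯
  4. R绘制中国地图,并展示流行病学数据 (Drawing a map of China with R and displaying epidemiological data) by 姜晓东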