How to use ggplot to group and show the top X categories?

I am trying to use ggplot to plot production data by company, using the color of the points to indicate the year. The following chart shows an example based on sample data: [image]

However, my real data often has 50-60 different companies, which makes the company names on the Y axis bunch up and look not very aesthetically pleasing.

What is the easiest way to show data for only the top 5 companies (ranked by their 2011 quantities) and then display the rest aggregated and shown as "Other"?

Below are some sample data and the code I used to create the sample chart:

library(ggplot2)

# create some sample data
company_names <- c("AAA","BBB","CCC","DDD","EEE","FFF","GGG","HHH","III","JJJ")

# 2010 quantities
df1 <- data.frame(Company = company_names, Quantity = c(1,2,3,4,5,6,7,8,9,10), Year = 2010)

# 2011 quantities
df2 <- data.frame(Company = company_names, Quantity = c(3,4,7,8,5,14,7,13,2,1), Year = 2011)

df <- rbind(df1, df2)

# create plot: one point per company and year, colored by year
p <- ggplot(data = df, aes(Quantity, Company)) +
  geom_point(aes(color = factor(Year)), size = 4)
p


You can subset the data to the top 5 companies by 2011 quantity before plotting:

    # Companies ordered by their 2011 quantity, largest first
    df2011 <- subset(df, Year == 2011)
    companies <- df2011$Company[order(df2011$Quantity, decreasing = TRUE)]

    # Plot only the rows belonging to the top 5 companies
    ggplot(data = subset(df, Company %in% companies[1:5]),
           aes(Quantity, Company)) +
      geom_point(aes(color = factor(Year)), size = 4)


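This takes your df dataframe and builds on the approach from @cbeleites. The steps are:

1. Select the 2011 data and order the companies by Quantity, from highest to lowest.

2. Split df into two parts: dftop, which contains the data for the top 5 companies; and dfother, which contains the aggregated data for the remaining companies (using ddply() from the plyr package).

3. Combine the two dataframes into one, dfnew.

4. Set the order in which the levels of Company are plotted: highest to lowest from top to bottom, then "Other". The order comes from companies, plus "Other".

5. Plot as before.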

library(ggplot2)
library(plyr)

# Step 1
df2011 <- subset (df, Year == 2011)
companies <- df2011$Company [order (df2011$Quantity, decreasing = TRUE)]

# Step 2
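dftop = subset(df, Company %in% companies [1:5])
# Store Company as plain text; this works whether it was a factor or a character
# column (droplevels() fails on character columns, the data.frame default since R 4.0)
dftop$Company = as.character(dftop$Company)

# Sum the quantities of all remaining companies within each year
dfother = ddply(subset(df, !(Company %in% companies [1:5])), .(Year),
                summarise, Quantity = sum(Quantity))
dfother$Company = "Other"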

# Step 3
dfnew = rbind(dftop, dfother)

# Step 4
dfnew$Company = factor(dfnew$Company, levels = c("Other", rev(as.character(companies)[1:5])))
levels(dfnew$Company)    # Check that the levels are in the correct order

# Step 5
p = ggplot (data = dfnew, aes (Quantity, Company)) +
        geom_point (aes (color = factor (Year)), size = 4)
p

The result:

[image]
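
If you would rather not depend on plyr, the same five steps can be wrapped in a small base R helper. This is only a sketch: lump_top_n is a made-up name (it is not part of ggplot2 or plyr), and it assumes the columns are called Company, Quantity and Year as in the question.

```r
# Hypothetical helper (not from the answers above): keep the top `n` companies
# ranked by Quantity in `ref_year`, and sum everything else into "Other".
lump_top_n <- function(df, n = 5, ref_year = 2011) {
  # Rank companies by their quantity in the reference year, largest first
  ref <- df[df$Year == ref_year, ]
  ranked <- as.character(ref$Company[order(ref$Quantity, decreasing = TRUE)])
  top <- head(ranked, n)

  # Relabel every company outside the top n as "Other"
  out <- df
  out$Company <- as.character(out$Company)
  out$Company[!(out$Company %in% top)] <- "Other"

  # Sum quantities per Company and Year (this collapses the "Other" rows)
  out <- aggregate(Quantity ~ Company + Year, data = out, FUN = sum)

  # Level order controls the y axis: "Other" at the bottom, largest on top
  out$Company <- factor(out$Company, levels = c("Other", rev(top)))
  out
}

# Sample data from the question
company_names <- c("AAA","BBB","CCC","DDD","EEE","FFF","GGG","HHH","III","JJJ")
df <- rbind(
  data.frame(Company = company_names, Quantity = 1:10, Year = 2010),
  data.frame(Company = company_names, Quantity = c(3,4,7,8,5,14,7,13,2,1), Year = 2011)
)

dfnew <- lump_top_n(df, n = 5, ref_year = 2011)
```

The returned dfnew can be passed to the same plotting call as in Step 5, i.e. ggplot(dfnew, aes(Quantity, Company)) + geom_point(aes(color = factor(Year)), size = 4). Passing a different n or ref_year changes how many companies are kept and which year they are ranked by.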
