MySQL: creating a frequency distribution

I have a simple BIRDCOUNT table below showing how many birds were counted on any given day:

+----------+
| NUMBIRDS |
+----------+
| 123      |
| 573      |
| 3        |
| 234      |
+----------+

I would like to create a frequency distribution graph showing how often each range of bird counts occurred. For that, I need MySQL to produce something like:

+------------+-------------+
| BIRD_COUNT | TIMES_SEEN  |
+------------+-------------+
| 0-99       | 17          |
| 100-199    | 23          |
| 200-299    | 12          |
| 300-399    | 122         |
| 400-499    | 3           |
+------------+-------------+

If the bird count ranges were fixed, that would be easy. However, I never know in advance the min / max number of birds that were seen. So I need a SELECT statement that:

  • Produces output similar to the above, always creating 10 ranges.
  • (more advanced) Produces output similar to the above, always creating N ranges.

I don't know if #2 is possible in a single SELECT, but can anyone solve #1?

+6
4
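SELECT
    -- lower and upper bound of the range each row falls into
    FLOOR( birds.bird_count / stat.diff ) * stat.diff AS range_start,
    (FLOOR( birds.bird_count / stat.diff ) + 1) * stat.diff - 1 AS range_end,
    COUNT( birds.bird_count ) AS times_seen
FROM birds_table birds,
    -- one-row subquery: range width for 10 ranges (at least 1, to avoid dividing by 0)
    (SELECT
        GREATEST(ROUND((MAX( bird_count ) - MIN( bird_count )) / 10), 1) AS diff
    FROM birds_table
    ) AS stat
-- grouping on the selected aliases keeps the query valid under ONLY_FULL_GROUP_BY
GROUP BY range_start, range_end;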

This covers both cases: to get N ranges instead of 10, change the 10 in the subquery.
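To check what the query above does with concrete numbers, its arithmetic can be replayed outside the database. The following is a minimal Python sketch, not part of the original answer: the names `bucket_bounds` and `histogram` are made up for illustration, the input is the four sample values from the question, and it assumes non-negative integer counts whose spread is large enough that the rounded width is not 0.

```python
import math
from collections import Counter

def bucket_bounds(value, width):
    """Return (range_start, range_end) the way the query computes them."""
    start = math.floor(value / width) * width
    return start, start + width - 1

def histogram(values, n_ranges=10):
    # Mirrors ROUND((MAX - MIN) / 10) from the subquery
    # (rounding half up, which is valid for non-negative inputs).
    width = math.floor((max(values) - min(values)) / n_ranges + 0.5)
    return Counter(bucket_bounds(v, width) for v in values)

counts = histogram([123, 573, 3, 234])  # sample values from the question
for (start, end), times_seen in sorted(counts.items()):
    print(f"{start}-{end}: {times_seen}")
```

With these values the width is 57 and the maximum, 573, lands in the range starting at 570, which is range index 10. The ranges are anchored at multiples of the width rather than at the minimum, so the result can span one range more than the 10 requested.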

+6

Counting the birds per day in SQL:

SELECT dateColumn, COUNT(*) AS NUMBIRDS
FROM birdTable
GROUP BY dateColumn

Then use that as a subquery and "bin" the counts:

SELECT CONCAT_WS('-', 
   FLOOR( NUMBIRDS/100 )*100,
   ((FLOOR( NUMBIRDS/100 )+1)*100) - 1
) AS BIRD_COUNT
,COUNT(*) AS TIMES_SEEN
FROM (
    SELECT dateColumn, COUNT(*) AS NUMBIRDS
    FROM birdTable
    GROUP BY dateColumn
) AS birdCounts
GROUP BY BIRD_COUNT

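Ranges with no matching rows will not appear in the result; to list them with a count of 0, LEFT JOIN from a table of all ranges.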
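A grouped query like the one above only returns bins that contain at least one row, so gaps in the data become gaps in the output. The sketch below uses Python purely for illustration; the function `fill_empty_bins` and the sample counts are hypothetical, not from the original answer. It shows the effect a LEFT JOIN against a table of all ranges would have: every bin is listed, and missing ones get a count of 0.

```python
def fill_empty_bins(seen, width=100):
    """seen maps a bin's lower bound to its count; return all bins, zero-filled."""
    if not seen:
        return []
    lows = range(min(seen), max(seen) + width, width)
    # dict.get(..., 0) plays the role of COALESCE(count, 0) after a LEFT JOIN.
    return [(f"{low}-{low + width - 1}", seen.get(low, 0)) for low in lows]

# Hypothetical grouped result: no day had 100-199 or 300-399 birds.
rows = fill_empty_bins({0: 2, 200: 5, 400: 1})
for label, times_seen in rows:
    print(label, times_seen)
```

In MySQL itself the list of lower bounds would have to come from somewhere, for example a small helper table of ranges, which is then the left side of the join.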

0

The key is the GROUP BY: compute the lower bound of each range and group on it.

Something like this:

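-- The lower bound is computed once in a derived table rather than in a
-- user variable, whose evaluation order inside a grouped SELECT is undefined.
SELECT
  Low,
  Low + 99 AS High,
  COUNT(*) AS Count
FROM (
  SELECT TRUNCATE(bird_count/100, 0) * 100 AS Low
  FROM birds_seen
) AS b
GROUP BY Low;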

For example, 123 and 145 both fall into the "100" range, and 234 and 246 into the "200" range.

0

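Building on @gustek's answer: instead of fixing the number of ranges, choose the bin width h with a rule; the number of bins is then k = ceil((max - min) / h).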

# Histogram using Scott's rule: width h = 3.49 * stddev / n^(1/3), so k = ceil((max - min) / h)
SELECT any_value(FLOOR(r2.value / stat.width) * stat.width) as range_start,
       count(r2.value)                                      as times_seen
FROM RESULT r2,
 (
     select 3.49 * stddev(r.value) / (power(count(*), 1 / 3)) as width
     from RESULT r
 ) as stat
GROUP BY FLOOR(r2.value / stat.width);

# Histogram using the Rice rule: k = ceil(2 * n^(1/3)), width h = (max - min) / k
SELECT any_value(FLOOR(r2.value / stat.width) * stat.width) as range_start,
       count(r2.value)                                      as times_seen
FROM RESULT r2,
 (
     select (max(r.value) - min(r.value)) / ceil(2 * power(count(*), 1 / 3)) as width
     from RESULT r
 ) as stat
GROUP BY FLOOR(r2.value / stat.width);

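The any_value() function is used to work around MySQL's ONLY_FULL_GROUP_BY mode, which is enabled by default since MySQL 5.7.5 and would otherwise reject range_start because it is not identical to the GROUP BY expression.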
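To get a feel for the bin widths these two rules produce, the same formulas can be evaluated on a small sample. This is a rough Python sketch for illustration only: `scott_width`, `rice_width` and the sample counts are made up here, and `statistics.pstdev` is used on the assumption that it matches MySQL's STDDEV(), which returns the population standard deviation.

```python
import math
import statistics

def scott_width(values):
    # h = 3.49 * sigma / n^(1/3); pstdev is the population standard deviation.
    return 3.49 * statistics.pstdev(values) / len(values) ** (1 / 3)

def rice_width(values):
    # k = ceil(2 * n^(1/3)) bins, then h = (max - min) / k.
    k = math.ceil(2 * len(values) ** (1 / 3))
    return (max(values) - min(values)) / k

# Made-up daily bird counts, used only to exercise the formulas.
sample = [123, 573, 3, 234, 310, 87, 402, 256, 145, 246]
for name, width in (("Scott", scott_width(sample)), ("Rice", rice_width(sample))):
    bins = math.ceil((max(sample) - min(sample)) / width)
    print(f"{name}: width={width:.1f}, about {bins} bins")
```

Unlike the fixed "10 ranges" approach, both rules let the number of bins grow slowly with the number of rows, so small tables get few wide ranges and large tables get more, narrower ones.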

0