Count the number of rows containing a letter / number

Question

What I'm trying to achieve is simple, but a little difficult to explain, and I don't know whether it is even possible in Postgres. I am at a fairly basic level: SELECT, FROM, WHERE, LEFT JOIN ON, HAVING, and so on.

I am trying to count the number of rows that contain a specific letter / number and display that count against the letter / number.

I.e. how many rows have entries that contain an "a / A" (case insensitive).

The table I am querying is a list of movie names. All I want to do is group and count "a-z" and "0-9" and display the totals. I could run 36 queries sequentially:

SELECT filmname FROM films WHERE filmname ilike '%a%'
SELECT filmname FROM films WHERE filmname ilike '%b%'
SELECT filmname FROM films WHERE filmname ilike '%c%'

And then run pg_num_rows on the result to find the number I need, etc.

I know how expensive LIKE is, and ILIKE even more so, so I would prefer to avoid this. Although the data (below) is in mixed case, I want the result sets to be case insensitive: in "The Men Who Stare At Goats", a / A, t / T and s / S must not be counted twice for a result set. I can duplicate the table into a secondary working table in which all the data is lowercased (strtolower) and run the query against that, if that makes the query easier to write.

An alternative might be something like

SELECT sum(length(regexp_replace(filmname, '[^Xx]', '', 'g'))) FROM films;

for each letter, but again that means 36 queries and 36 result sets. I would prefer to get the data in one query.
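One thing worth checking about the regexp_replace idea: summing the length of what is left after stripping every other character counts occurrences of the character, not titles that contain it. The Python sketch below is only an illustration of that difference on the 14 sample titles; the helper names `occurrences` and `titles_containing` are made up for this example and are not part of any answer.

```python
import re

# The 14 sample titles from the question.
titles = [
    "District 9", "Surrogates", "The Invention Of Lying", "Pandorum", "UP",
    "The Soloist", "Cloudy With A Chance Of Meatballs",
    "The Imaginarium of Doctor Parnassus",
    "Cirque du Freak: The Vampires Assistant", "Zombieland", "9",
    "The Men Who Stare At Goats", "A Christmas Carol", "Paranormal Activity",
]

def occurrences(ch):
    """Strip everything except ch and sum the remaining lengths
    (the same idea as sum(length(regexp_replace(...))))."""
    pattern = "[^" + re.escape(ch) + "]"
    return sum(len(re.sub(pattern, "", t.lower())) for t in titles)

def titles_containing(ch):
    """Count each title at most once, which is what the question asks for."""
    return sum(1 for t in titles if ch in t.lower())

print(occurrences("a"), titles_containing("a"))
```

For "a" the first number is larger than the second, because titles such as "Paranormal Activity" contain the letter several times.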

Here is a short data set of 14 films from my set (which actually contains 275 rows):

District 9
Surrogates
The Invention Of Lying
Pandorum
UP
The Soloist
Cloudy With A Chance Of Meatballs
The Imaginarium of Doctor Parnassus
Cirque du Freak: The Vampires Assistant
Zombieland
9
The Men Who Stare At Goats
A Christmas Carol
Paranormal Activity

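In the grid below, each of the 14 columns stands for one film, in the order listed above; an x marks a film that contains the character, and the last column is the total: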

A  x x  xxxx xxx  9 
B       x  x      2 
C x     xxx   xx  6
D x  x  xxxx      6
E  xx  xxxxx x    8
F   x   xxx       4 
G  xx    x   x    4
H   x  xxxx  xx   7
I x x  xxxxx  xx  9
J                 0
K         x       1
L   x  xx  x  xx  6
M    x  xxxx xxx  8
N   xx  xxxx x x  8
O  xxx xxx x xxx  10
P    xx  xx    x  5
Q         x       1
R xx x   xx  xxx  8
S xx   xxxx  xx   8
T xxx  xxxx  xxx  10
U  x xx xxx       6
V   x     x    x  3
W       x    x    2
X                 0 
Y   x   x      x  3
Z          x      1 
0                 0  
1                 0  
2                 0 
3                 0
4                 0
5                 0
6                 0
7                 0
8                 0
9 x         x     2

The column is "filmname". For example, film 5 contains only "u" and "p", and film 11 contains only "9".

The result I am after is just the totals: A 9, B 2, C 6, D 6, E 8, etc.

The alternative is a PHP loop running 36 queries.

The table currently has 275 rows and grows by about 8.33 a month (100 a year), so by 2019 it should hold roughly 1000 rows.

Titles run to about 50 characters at most; the shortest is 1 character, "9".

This is Postgres 9.0.0.

1

Erwin //. .

"9" , Erwin. .

kgrittn, the version is 9.0.0.

Erwin

The query returns 36 letters / numbers, each with a count (ct).


SELECT DISTINCT id, unnest(string_to_array(lower(film), NULL)) AS letter
FROM  films

" ". , , .

, 14 " NULL"

I changed COALESCE(y.ct, 0) to COALESCE(y.ct, 4)

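Every count then showed 4.

As I understand COALESCE, "4" is the fallback value, so y.ct must be NULL (I would have expected that only for a character no film contains, e.g. "q": if no title had a "q", would the count be NULL?)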

, , SQL_ASCII, , - , 8.4.0 UTF-8.

sql aggregate-functions count postgresql

George May 10, '12 at 16:07

SELECT 
    'a', SUM( (title ILIKE '%a%')::integer),
    'b', SUM( (title ILIKE '%b%')::integer),
    'c', SUM( (title ILIKE '%c%')::integer)
FROM film

And so on for the other 33 :)

BTW, 1000 rows is nothing for PostgreSQL.

edit:

SELECT chars.c, COUNT(title)
FROM (VALUES ('a'), ('b'), ('c')) as chars(c)
    LEFT JOIN film ON title ILIKE ('%' || chars.c || '%')
GROUP BY chars.c
ORDER BY chars.c

(VALUES ('a'), ('b'), ('c')) AS chars(c) acts as an inline table; extend the list with the remaining characters.
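To see the shape of this join in action without a PostgreSQL server, here is a rough sketch that runs an equivalent query in SQLite through Python's sqlite3 module. This swaps engines, so several things are assumptions rather than the answer's code: SQLite has no ILIKE (its LIKE is already case-insensitive for ASCII), the VALUES list is moved into a CTE because SQLite does not accept the `AS chars(c)` column alias form, and an extra 'x' is added to show a character with no match.

```python
import sqlite3

# The 14 sample titles from the question.
titles = [
    "District 9", "Surrogates", "The Invention Of Lying", "Pandorum", "UP",
    "The Soloist", "Cloudy With A Chance Of Meatballs",
    "The Imaginarium of Doctor Parnassus",
    "Cirque du Freak: The Vampires Assistant", "Zombieland", "9",
    "The Men Who Stare At Goats", "A Christmas Carol", "Paranormal Activity",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (title TEXT)")
conn.executemany("INSERT INTO film (title) VALUES (?)", [(t,) for t in titles])

# LEFT JOIN keeps every character; COUNT(film.title) ignores the NULL
# produced for a character with no matching title, so 'x' yields 0.
query = """
WITH chars(c) AS (VALUES ('a'), ('b'), ('c'), ('x'))
SELECT chars.c, COUNT(film.title)
FROM chars
LEFT JOIN film ON film.title LIKE '%' || chars.c || '%'
GROUP BY chars.c
ORDER BY chars.c
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('a', 9), ('b', 2), ('c', 6), ('x', 0)]
```

Counting `film.title` rather than `*` is what makes the zero rows come out as 0 instead of 1.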

Eelke May 10, '12 at 16:27

SELECT
  -- POSITION() is case-sensitive, so compare against lower(filmname)
  SUM(CASE WHEN POSITION('a' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "A",
  SUM(CASE WHEN POSITION('b' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "B",
  SUM(CASE WHEN POSITION('c' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "C",
  ...
  SUM(CASE WHEN POSITION('z' IN lower(filmname)) > 0 THEN 1 ELSE 0 END) AS "Z",
  SUM(CASE WHEN POSITION('0' IN filmname) > 0 THEN 1 ELSE 0 END) AS "0",
  SUM(CASE WHEN POSITION('1' IN filmname) > 0 THEN 1 ELSE 0 END) AS "1",
  ...
  SUM(CASE WHEN POSITION('9' IN filmname) > 0 THEN 1 ELSE 0 END) AS "9"
FROM films;

gregjor May 10, '12 at 16:45

, Erwins, , , :

Setup:

CREATE TABLE char (name char (1), id serial);
INSERT INTO char (name) VALUES ('a');
INSERT INTO char (name) VALUES ('b');
INSERT INTO char (name) VALUES ('c');

Query:

SELECT char.name, COUNT(*) 
  FROM char, film 
  WHERE film.name ILIKE '%' || char.name || '%' 
  GROUP BY char.name 
  ORDER BY char.name;

ILIKE makes the match case-insensitive.

I'm not 100% happy about using the keyword 'char' as the table name, but I haven't run into trouble with it so far. On the other hand, it is a natural name. If you translate it into another language, for example "zeichen" in German, you avoid the ambiguity.

user unknown May 12, '12 at 2:23

Erwin Brandstetter · Accepted Answer · 2012-05-10T16:33:32+0000

Test setup:

CREATE TEMP TABLE films (id serial, film text);
INSERT INTO films (film) VALUES
 ('District 9')
,('Surrogates')
,('The Invention Of Lying')
,('Pandorum')
,('UP')
,('The Soloist')
,('Cloudy With A Chance Of Meatballs')
,('The Imaginarium of Doctor Parnassus')
,('Cirque du Freak: The Vampires Assistant')
,('Zombieland')
,('9')
,('The Men Who Stare At Goats')
,('A Christmas Carol')
,('Paranormal Activity');

Query:

SELECT l.letter, COALESCE(y.ct, 0) AS ct
FROM  (
    SELECT chr(generate_series(97, 122)) AS letter  -- a-z in UTF8!
    UNION ALL
    SELECT generate_series(0, 9)::text              -- 0-9
    ) l
LEFT JOIN (
    SELECT letter, count(id) AS ct
    FROM  (
        SELECT DISTINCT  -- count film once per letter
               id, unnest(string_to_array(lower(film), NULL)) AS letter
        FROM   films
        ) x
    GROUP  BY 1
    ) y  USING (letter)
ORDER  BY 1;

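Requires PostgreSQL 9.1! From the release notes:

Change string_to_array() so a NULL separator splits the string into characters (Pavel Stehule)
Previously this returned a null value.

You can use regexp_split_to_table(lower(film), '') instead of unnest(string_to_array(lower(film), NULL)) (this also works in versions pre-9.1!), though the regular expression variant is typically somewhat slower.
generate_series() produces [a-z0-9] as individual rows. The LEFT JOIN to the counts keeps every character in the result, including those with no match.
DISTINCT counts each film only once per character.
Don't worry about 1000 rows. That is a small table for PostgreSQL.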
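As a cross-check of the logic (not a replacement for the query), the same three steps can be mirrored in plain Python: split each lowercased title into distinct characters, count each character once per film, then fill in zero for every character of a-z and 0-9 that never appears. The variable names are invented for this sketch, and unlike `ORDER BY 1` in PostgreSQL, which depends on collation, the list here simply puts letters before digits.

```python
import string
from collections import Counter

# The 14 sample titles from the question.
titles = [
    "District 9", "Surrogates", "The Invention Of Lying", "Pandorum", "UP",
    "The Soloist", "Cloudy With A Chance Of Meatballs",
    "The Imaginarium of Doctor Parnassus",
    "Cirque du Freak: The Vampires Assistant", "Zombieland", "9",
    "The Men Who Stare At Goats", "A Christmas Carol", "Paranormal Activity",
]

# Equivalent of SELECT DISTINCT id, unnest(...): a set holds each
# character of a title once, so a film counts once per character.
counts = Counter()
for title in titles:
    counts.update(set(title.lower()))

# Equivalent of the generate_series() list plus LEFT JOIN / COALESCE(ct, 0):
# every character of a-z and 0-9 appears, with 0 when no film contains it.
alphabet = string.ascii_lowercase + string.digits
result = [(ch, counts.get(ch, 0)) for ch in alphabet]

for ch, ct in result:
    print(ch, ct)
```

For the sample data this yields 36 rows, including a: 9, x: 0 and 9: 2. Spaces and the colon are counted too, but they drop out because only the 36 listed characters are looked up.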
