Find which lines in a file contain specific characters

Is there a way to find out whether a string contains any of the characters in a set, using Python?

It’s easy to do this with a single character, but I need to check and see if the string contains any of the many bad characters.

In particular, suppose I have a line:

s = 'amanaplanacanalpanama~012345'

and I want to see if the string contains any vowels:

bad_chars = 'aeiou'

and do it in a for loop for each line in the file:

if [any one or more of the bad_chars] in s:
    do something

I am scanning a large file, so a faster way would be ideal. Also, there is no need to check every bad character: as soon as one of them occurs, the search for that line can stop.

I'm not sure if there is a built-in function or an easy way to implement this, but I haven't found anything yet. Any pointers would be greatly appreciated!

+5
any((c in badChars) for c in yourString)

any((c in yourString) for c in badChars)  # extensionally equivalent, slower

set(yourString) & set(badChars)  # extensionally equivalent, slower
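
As a quick sanity check, the three expressions above can be wrapped in small helpers and compared on the string from the question. The function names are made up for illustration; this is only a sketch.

```python
def contains_any_scan(text, bad_chars):
    """True if any character of text is in bad_chars (stops at the first hit)."""
    return any(c in bad_chars for c in text)


def contains_any_reverse(text, bad_chars):
    """Same answer, iterating over bad_chars instead of text."""
    return any(c in text for c in bad_chars)


def contains_any_sets(text, bad_chars):
    """Same answer via set intersection; both sets are built in full first."""
    return bool(set(text) & set(bad_chars))


s = 'amanaplanacanalpanama~012345'
print(contains_any_scan(s, 'aeiou'))          # True
print(contains_any_scan('~012345', 'aeiou'))  # False
```

All three return the same boolean; only the amount of work done before an answer is known differs.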

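Edit: the re module can do this as well: build a character class [...] from the bad characters and use .finditer to locate the matches. Characters that are special inside a character class (such as \, ], [ and -, or sequences like \w) have to be escaped first.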
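
A minimal sketch of the regular-expression route, assuming the bad characters arrive as a plain, non-empty string. The helper name build_bad_char_pattern is hypothetical; re.escape and finditer are standard library calls.

```python
import re


def build_bad_char_pattern(bad_chars):
    """Compile a character class matching any single character in bad_chars.

    Assumes bad_chars is non-empty; an empty class '[]' is not a valid pattern.
    """
    # re.escape neutralises characters that are special inside [...],
    # such as backslash, ']', '^' and '-'.
    return re.compile('[' + re.escape(bad_chars) + ']')


pattern = build_bad_char_pattern('aeiou')
s = 'amanaplanacanalpanama~012345'

# finditer yields one match object per occurrence, left to right.
positions = [m.start() for m in pattern.finditer(s)]
print(positions[:3])  # [0, 2, 4]
```

If only a yes/no answer is needed, pattern.search(s) stops at the first match instead of walking the whole string.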


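Also, str.__contains__ is O(N) rather than O(1), so every `c in badChars` test scans the string. To make `in` an O(1) lookup, convert badChars to a set first: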

badCharSet = set(badChars)
any((c in badCharSet) for c in yourString)

(You could also write any((c in set(yourString)) for c in badChars), but as written Python rebuilds the set on every iteration.)


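Is the conversion worth it?

Building the set costs O(#badchars) once, whereas leaving badChars as a string costs on the order of O(#lines * #badchars) across the whole file, so the one-time conversion is cheap by comparison.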
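
Applied to the per-line loop from the question, the set conversion might look like the sketch below. The function name lines_with_bad_chars and the sample data are assumptions for illustration; any iterable of lines works, including an open file object.

```python
def lines_with_bad_chars(lines, bad_chars):
    """Yield (line_number, line) for each line containing a bad character."""
    bad_char_set = set(bad_chars)  # built once, reused for every line
    for line_number, line in enumerate(lines, start=1):
        # any() stops at the first bad character it meets
        if any(c in bad_char_set for c in line):
            yield line_number, line.rstrip('\n')


sample = ['xyz\n', 'hello\n', '~0123\n', 'sky and sea\n']
for number, text in lines_with_bad_chars(sample, 'aeiou'):
    print(number, text)
# prints:
# 2 hello
# 4 sky and sea
```

Because the function is a generator, a large file can be passed in directly and is read one line at a time.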

+9

Use Python's built-in any.

if any((bad_char in my_string) for bad_char in bad_chars):
    # do something 
+4

Use sets. For example, read the file and intersect its characters with the bad ones:

#!/usr/bin/python

bad_chars = set('aeiou')

with open('/etc/passwd', 'r') as file_:
    file_string = file_.read()
file_chars = set(file_string)

if file_chars & bad_chars:
    print('found something bad')
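
The script above answers the question for the file as a whole. For a per-line check with sets, one option is set.isdisjoint, which accepts any iterable and so does not need a second set built from each line. A small sketch, with line_is_clean as a made-up helper name:

```python
bad_chars = frozenset('aeiou')


def line_is_clean(line):
    """True if the line shares no character with bad_chars."""
    # isdisjoint takes any iterable, so the string is passed in directly
    return bad_chars.isdisjoint(line)


print(line_is_clean('~012345\n'))              # True
print(line_is_clean('amanaplanacanalpanama'))  # False
```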
+2

Alternatively, instead of any, use a regular expression:

import re

r = re.compile('[aeiou]')
if r.search(s):
    pass  # do something
+1

The following Python code should print any character in bad_chars if it exists in s:

for c in bad_chars:
    if c in s:
        print(c)  # do something with the matching character

You can also use the built-in any, along these lines:

>>> any(e for e in bad_chars if e in s)
True
0