I am building a web scraper with Python 2.6.4 and Scrapy. I need the data for analysis, but this is also my first Python learning project. I am failing to build the SQL INSERT statement in my pipelines.py. The real query has about 30 attributes to insert.
First, is there a better way to write this UPDATE-or-INSERT logic? I am open to improvements.
Second, below are two syntax variants I tried and the error each one produces. I have tried many variations based on examples, but I cannot find an example of "INSERT ... SET" split across several lines. What is the correct syntax?
The database is currently empty, so execution always takes the INSERT branch.
def _conditional_insert(self, tx, item):
    # Check whether a row for this username already exists.
    tx.execute("SELECT username FROM profiles_flat WHERE username = %s", (item['username'][0], ))
    result = tx.fetchone()
    if result:
        # Row exists: UPDATE it.
        tx.execute( \
            """UPDATE profiles_flat SET
            username=`%s`,
            headline=`%s`,
            age=`%s`
            WHERE username=`%s`""", ( \
            item['username'],
            item['headline'],
            item['age'],
            item['username'],)
        )
    else:
        # No row yet: INSERT one.
        tx.execute( \
            """INSERT INTO profiles_flat SET
            username=`%s`,
            headline=`%s`,
            age=`%s` """, ( \
            item['username'],
            item['headline'],
            item['age'], )
        )
Error:
[Failure instance: Traceback: <class '_mysql_exceptions.OperationalError'>: (1054, "Unknown column ''missLovely92 '' in 'field list'")
/usr/lib/python2.6/threading.py:497:__bootstrap
/usr/lib/python2.6/threading.py:525:__bootstrap_inner
/usr/lib/python2.6/threading.py:477:run
--- <exception caught here> ---
/usr/lib/python2.6/vendor-packages/twisted/python/threadpool.py:210:_worker
/usr/lib/python2.6/vendor-packages/twisted/python/context.py:59:callWithContext
/usr/lib/python2.6/vendor-packages/twisted/python/context.py:37:callWithContext
/usr/lib/python2.6/vendor-packages/twisted/enterprise/adbapi.py:429:_runInteraction
/export/home/raven/scrapy/project/project/pipelines.py:222:_conditional_insert
/usr/lib/python2.6/vendor-packages/MySQLdb/cursors.py:166:execute
/usr/lib/python2.6/vendor-packages/MySQLdb/connections.py:35:defaulterrorhandler
]
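For reference, error 1054 is consistent with how MySQL treats backticks: they quote identifiers, so after the driver substitutes the value, the quoted username is parsed as a column name. Below is a minimal sketch of the same INSERT ... SET statement with bare %s placeholders, built from a list of column names so that roughly 30 attributes do not have to be typed by hand. The helper name build_insert_set and the sample values are hypothetical. Taking element [0] of each field is an assumption based on the SELECT above. Column names are interpolated into the SQL text, so they should come from trusted code, never from scraped data.

```python
def build_insert_set(table, columns):
    """Return an INSERT ... SET statement with one bare %s placeholder per column."""
    # "%s=%%s" % col yields e.g. "username=%s"; the driver fills in and quotes the value.
    assignments = ",\n    ".join("%s=%%s" % col for col in columns)
    return "INSERT INTO %s SET\n    %s" % (table, assignments)


# Hypothetical column list and item; the real pipeline would have about 30 columns.
columns = ["username", "headline", "age"]
item = {"username": ["missLovely92 "], "headline": ["hello"], "age": ["27"]}

query = build_insert_set("profiles_flat", columns)
# Assumption: each scraped field is a list and only its first element is stored.
params = tuple(item[col][0] for col in columns)

print(query)
# Inside the pipeline the call would then be: tx.execute(query, params)
```

The values travel separately as the second argument of tx.execute, so the driver handles quoting and escaping, and the multi-line layout of the statement does not matter to MySQL.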
Alternative syntax:
query = """INSERT INTO profiles_flat SET
username=`%s`,
headline=`%s`,
age=`%s` """ % \
item['username'],
item['headline'],
item['age']
tx.execute(query)
Error:
[Failure instance: Traceback: <type 'exceptions.TypeError'>: not enough arguments for format string
/usr/lib/python2.6/threading.py:497:__bootstrap
/usr/lib/python2.6/threading.py:525:__bootstrap_inner
/usr/lib/python2.6/threading.py:477:run
--- <exception caught here> ---
/usr/lib/python2.6/vendor-packages/twisted/python/threadpool.py:210:_worker
/usr/lib/python2.6/vendor-packages/twisted/python/context.py:59:callWithContext
/usr/lib/python2.6/vendor-packages/twisted/python/context.py:37:callWithContext
/usr/lib/python2.6/vendor-packages/twisted/enterprise/adbapi.py:429:_runInteraction
/export/home/raven/scrapy/project/project/pipelines.py:196:_conditional_insert
]
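The TypeError in this second variant looks like an operator precedence issue: without parentheses, % is applied to item['username'] alone, so the three %s slots receive only one value. A small sketch with made-up values reproduces the message and shows the parenthesized form. It only illustrates the Python error; formatting values into SQL with % bypasses the driver's escaping, so passing them as the second argument of tx.execute remains the safer route.

```python
template = "username=%s, headline=%s, age=%s"
item = {"username": "missLovely92", "headline": "hello", "age": 27}  # made-up sample

# Without parentheses, % sees only the first value and raises TypeError.
error_message = None
try:
    broken = template % item["username"], item["headline"], item["age"]
except TypeError as exc:
    error_message = str(exc)

# With parentheses, all three values form one tuple for the three %s slots.
fixed = template % (item["username"], item["headline"], item["age"])

print(error_message)
print(fixed)
```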