perl-python
rename log files to date range
Message #101 of 127
Here's an example script that performs a simple task. The task is:

Suppose you have a bunch of files named
access_log.1.gz
access_log.2.gz
access_log.3.gz

These are weekly log files, and the date range of each can be read from
the first and last lines of the file.

You want to rename these files to indicate their date range:
20050101-20050105.gz
20050105-20050110.gz
20050110-20050115.gz

The following code is a pure Python script that does the job.

-------------------------------
# -*- coding: utf-8 -*-
# Python 3

import os
import re
import gzip

# for each file in mydir named like access_log.*.gz:
#   decompress it in memory
#   find the date in the first line and in the last line
#   write the content to a new file named like 20050101-20050105.txt
# note: the new file is not compressed and the original .gz is kept

mydir = '/Users/t/logs/'

mon = {
    'Jan': '01',
    'Feb': '02',
    'Mar': '03',
    'Apr': '04',
    'May': '05',
    'Jun': '06',
    'Jul': '07',
    'Aug': '08',
    'Sep': '09',
    'Oct': '10',
    'Nov': '11',
    'Dec': '12',
}

def getdate(li):
    # li is one log line; its 4th space-separated field is assumed
    # to hold the date, like this: [07/Sep/2005:10:15:00
    datestr = li.split(' ')[3][1:12]
    dd, mmm, yyyy = datestr.split('/')
    return yyyy + mon[mmm] + dd

for ff in os.listdir(mydir):
    # \d+ so that access_log.10.gz and up are matched too
    if re.search(r'^access_log\.\d+\.gz$', ff):
        print(ff)
        # open by full path; a bare name only works if mydir is the current dir
        inF = gzip.open(os.path.join(mydir, ff), 'rb')
        s = inF.readlines()
        inF.close()

        # lines are bytes; latin-1 decodes any byte without error
        start_date = getdate(s[0].decode('latin-1'))
        end_date = getdate(s[-1].decode('latin-1'))
        new_file_name = start_date + '-' + end_date + '.txt'

        outF = open(os.path.join(mydir, new_file_name), 'wb')
        outF.write(b''.join(s))
        outF.close()

        # to do: compress the new file
        # to do: delete the original gzip file
        del s[:]  # release the lines before reading the next file
-------------------------------
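
To see how the date extraction works, here is a small standalone sketch. The sample line is made up, and it assumes the logs are in the Apache common log format, where the 4th space-separated field starts with a bracket followed by day/month/year. If your log format differs, the field index and the slice will need adjusting.

```python
# Month abbreviation to two-digit number, same table as in the script.
MON = {'Jan': '01', 'Feb': '02', 'Mar': '03', 'Apr': '04',
       'May': '05', 'Jun': '06', 'Jul': '07', 'Aug': '08',
       'Sep': '09', 'Oct': '10', 'Nov': '11', 'Dec': '12'}

def getdate(line):
    """Return the date of one log line as a 'yyyymmdd' string."""
    field = line.split(' ')[3]              # '[07/Sep/2005:10:15:00'
    dd, mmm, yyyy = field[1:12].split('/')  # '07', 'Sep', '2005'
    return yyyy + MON[mmm] + dd

# Hypothetical sample line, not taken from a real log.
sample = ('192.0.2.1 - - [07/Sep/2005:10:15:00 -0700] '
          '"GET /index.html HTTP/1.0" 200 2326')

print(getdate(sample))  # prints 20050907
```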

This script does not call any external programs, so it is system
independent and can be used on Unix or Windows. The problem with this
script is that log files are usually very large, often over 100
megabytes. Because the script reads the whole file into memory, it is
very slow and memory intensive.

A solution is to call OS programs such as gzip to do the decompression,
and use Unix “head” and “tail” to get the first and last lines of the
file for extraction of the date range. We will study that and post a
solution later.
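
One way to lower the memory use while staying in pure Python is to read the gzip file one line at a time and keep only the first and last lines. This is just a sketch, not the script above; the function name first_and_last_line is made up for illustration. It still has to decompress the whole file, so it saves memory, not decompression time.

```python
import gzip

def first_and_last_line(path):
    """Return (first, last) lines of a gzip file as bytes.

    Only one line is held in memory at a time. Both values are
    None if the file is empty.
    """
    first = last = None
    with gzip.open(path, 'rb') as f:
        for line in f:
            if first is None:
                first = line
            last = line
    return first, last
```

Since the content of the file does not change, the original .gz file could then simply be renamed with os.rename once the two dates are known, instead of being written out again.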
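
In the meantime, here is a rough sketch of what that might look like, using the subprocess module to run the equivalent of "gzip -dc file | head -n 1" and "gzip -dc file | tail -n 1". It assumes gzip, head and tail are installed and on the PATH, so it is Unix only. The function name is made up, and this is not a tested replacement for the script above.

```python
import subprocess

def first_and_last_line_via_shell(path):
    """Return (first, last) lines of a gzip file as bytes,
    using the external programs gzip, head and tail."""

    def run(filter_cmd):
        # equivalent to the shell pipeline: gzip -dc path | filter_cmd
        unzip = subprocess.Popen(['gzip', '-dc', path],
                                 stdout=subprocess.PIPE,
                                 stderr=subprocess.DEVNULL)
        out = subprocess.check_output(filter_cmd, stdin=unzip.stdout)
        # close our end of the pipe so gzip stops if head quit early
        unzip.stdout.close()
        unzip.wait()
        return out

    return run(['head', '-n', '1']), run(['tail', '-n', '1'])
```

Note that the file gets decompressed twice, once for head and once for tail. The head pass stops early, but the tail pass has to go through the whole file.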

Xah
xah@...
http://xahlee.org/





Wed Sep 7, 2005 10:15 am

p0lyglut