pylogsparser-0.4/normalizers/postfix.xml
Postfix log normalization.
Postfix logs consist of a message UID and a list of keys and values.
Normalized keys are "client", "to", "from", "orig_to", "relay", "size", "message-id" and "status".
Ce normaliseur analyse les logs émis par le service Postfix.
Les messages Postfix consistent en un UID de message et une liste variable de clés et de valeurs associées.
Les clés extraites par ce normaliseur sont "client", "to", "from", "orig_to", "relay", "size", "message-id" et "status".
mhu@wallix.com
the hexadecimal message UID
l'UID associé au message, exprimé sous forme d'un nombre hexadécimal
[0-9A-F]{11}
# find the component and trim the program
if log.get("program", "").startswith("postfix"):
    log["component"] = log['program'][8:]
    log["program"] = "postfix"
ACCEPTED = [ "client",
"to",
"from",
"orig_to",
"relay",
"size",
"message-id",
"status" ]
# regex matching "client" and "relay" values of the form host[ip]:port
r = re.compile(r'(?P<host>[A-Za-z0-9\-.]+)(?P<ip>\[.*\])?(?P<port>:\d+)?$')
couples = value.split(', ')
for couple in couples:
    tagname, tagvalue = couple.split('=', 1)
    if tagname in ACCEPTED:
        tagvalue = tagvalue.strip('<>')
        log[tagname] = tagvalue
        if tagname == 'status':
            log[tagname] = log[tagname].split()[0]
TRANSLATE = {"to": "message_recipient",
"from": "message_sender",
"size": "len",
"message-id": "message_id"}
for k, v in TRANSLATE.items():
    if k in log:
        log[v] = log.pop(k)
if 'client' in log:
    host, ip, port = r.match(log['client']).groups()
    if host:
        log['source_host'] = host
    if ip:
        log['source_ip'] = ip.strip("[]")
    if port:
        log['source_port'] = port.strip(':')
if 'relay' in log:
    host, ip, port = r.match(log['relay']).groups()
    if host:
        log['dest_host'] = host
    if ip:
        log['dest_ip'] = ip.strip("[]")
    if port:
        log['dest_port'] = port.strip(':')
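As a standalone illustration, the key-value decoding above can be sketched as a plain function. This is a sketch, not the normalizer itself: the `decode` function and its `log` dict stand in for the `value`/`log` objects that the parsing framework passes to the callback.

```python
import re

# Hypothetical standalone sketch of decode_postfix_key_value; in the real
# normalizer, "value" and "log" are supplied by the parsing framework.
ACCEPTED = ["client", "to", "from", "orig_to", "relay", "size", "message-id", "status"]
TRANSLATE = {"to": "message_recipient", "from": "message_sender",
             "size": "len", "message-id": "message_id"}
addr_re = re.compile(r'(?P<host>[A-Za-z0-9\-.]+)(?P<ip>\[.*\])?(?P<port>:\d+)?$')

def decode(value):
    log = {}
    for couple in value.split(', '):
        tagname, tagvalue = couple.split('=', 1)
        if tagname in ACCEPTED:
            log[tagname] = tagvalue.strip('<>')
            if tagname == 'status':
                # keep only the status word, drop the trailing explanation
                log[tagname] = log[tagname].split()[0]
    for k, v in TRANSLATE.items():
        if k in log:
            log[v] = log.pop(k)
    if 'client' in log:
        host, ip, port = addr_re.match(log['client']).groups()
        if ip:
            log['source_ip'] = ip.strip('[]')
        if port:
            log['source_port'] = port.strip(':')
    return log

sample = 'to=<root@ubuntu>, orig_to=<root>, relay=none, status=bounced (Host not found)'
```

Running `decode(sample)` yields the same tags as the bounced-mail example for this normalizer: `message_recipient`, `orig_to`, `relay`, and a `status` reduced to its first word.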
postfix.+
Generic Postfix message with a UID and several key-value couples.
Message Postfix générique comportant un UID et plusieurs couples clé-valeur.
UID: KEYVALUES
the Postfix message UID
l'UID du message
UID
the Postfix key-value couples
les couples clé-valeur du log
KEYVALUES
decode_postfix_key_value
74275790B06: to=<root@ubuntu>, orig_to=<root>, relay=none, delay=0.91, delays=0.31/0.07/0.53/0, dsn=5.4.4, status=bounced (Host or domain name not found. Name service error for name=ubuntu type=A: Host not found)
74275790B06
root@ubuntu
root
none
bounced
mail
pylogsparser-0.4/normalizers/cisco-asa_header.xml
This normalizer is able to parse logs received via the syslog
export facility from a Cisco ASA. The normalizer has been validated with Cisco ASA version 8.4.
The standard export format (non-EMBLEM) with the "Device ID" and "timestamp" options must be
selected for this normalizer.
Ce normaliseur reconnaît les logs Cisco ASA exportés via
la facilité syslog. Ce normaliseur a été validé avec la version 8.4 de l'IOS Cisco ASA.
Le format d'export standard (Non EMBLEM) doit être sélectionné avec les options "Device ID" et
"timestamp".
fbo@wallix.com
Expression matching a syslog line priority, defined as 8*facility + severity.
Expression correspondant à la priorité du message, suivant la formule 8 x facilité + gravité.
\d{1,3}
Expression matching a date in the MMM dd YYYY hh:mm:ss format.
Expression correspondant à la date au format MMM dd YYYY hh:mm:ss.
[A-Z]{1}[a-z]{2} [0-9]{1,2} [0-9]{4} \d{2}:\d{2}:\d{2}
Expression matching the device ID.
Expression correspondant à l'identifiant de l'équipement.
[^: ]+
# define facilities
FACILITIES = { 0: "kernel",
1: "user",
2: "mail",
3: "daemon",
4: "auth",
5: "syslog",
6: "print",
7: "news",
8: "uucp",
9: "ntp",
10: "secure",
11: "ftp",
12: "ntp",
13: "audit",
14: "alert",
15: "ntp" }
for i in range(0, 8):
    FACILITIES[i+16] = "local%d" % i
# define severities
SEVERITIES = { 0: "emerg",
1: "alert",
2: "crit",
3: "error",
4: "warn",
5: "notice",
6: "info",
7: "debug" }
facility = int(value) // 8
severity = int(value) % 8
if facility not in FACILITIES or severity not in SEVERITIES:
    raise ValueError('facility or severity is out of range')
log["facility"] = "%s" % FACILITIES[facility]
log["severity"] = "%s" % SEVERITIES[severity]
log["facility_code"] = "%d" % facility
log["severity_code"] = "%d" % severity
SEVERITIES = { 0: "emerg",
1: "alert",
2: "crit",
3: "error",
4: "warn",
5: "notice",
6: "info",
7: "debug" }
log["severity_code"] = "%s" % str(value)
log["severity"] = "%s" % SEVERITIES[int(value)]
Expression matching the Cisco ASA Syslog header
Expression validant l'entête Syslog d'un équipement Cisco ASA
<PRIORITY>DATE SOURCE : %ASA-SEVERITY-MNEMONIC: BODY
the log's priority
la priorité du log, égale à 8 x facilité + gravité
PRIORITY
decode_priority
the log's date
l'horodatage du log
DATE
MMM dd YYYY hh:mm:ss
the log's source (device ID)
l'équipement ASA à l'origine de l'événement
SOURCE
the log's severity
la sévérité du log
SEVERITY
decode_asa_severity
the Cisco ID of the event
l'identifiant Cisco de l'événement
MNEMONIC
the actual event message
le message décrivant l'événement
BODY
cisco-asa
<165>Jan 25 2012 18:31:09 ciscoasa : %ASA-5-111008: User 'enable_15' executed the 'logging host inside2 192.168.30.2 6/11508' command.
local4
notice
5
ciscoasa
2012-01-25 18:31:09
111008
cisco-asa
User 'enable_15' executed the 'logging host inside2 192.168.30.2 6/11508' command.
pylogsparser-0.4/normalizers/UserAgent.xml
This normalizer extracts additional info from the useragent field in a HTTP request.
Ce normaliseur extrait des données supplémentaires du champ useragent présent dans les requêtes HTTP.
mhu@wallix.com
m = extras.robot_regex.search(value)
if m:
    log["search_engine_bot"] = m.group().lower()
known_os = {"Mac OS" : "Mac/Apple",
"Windows" : "Windows",
"Linux" : "Linux"}
guess = "unknown"
for i, j in known_os.items():
    if i in value:
        guess = j
        break
log['source_os'] = guess
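The OS guess above can be exercised standalone. This is a sketch of the guessOS callback's logic only; the library's `extras.robot_regex` used by findBot is not reproduced here.

```python
# Standalone sketch of the guessOS callback: first known substring wins,
# otherwise the OS is reported as "unknown".
known_os = {"Mac OS": "Mac/Apple", "Windows": "Windows", "Linux": "Linux"}

def guess_os(useragent):
    for needle, label in known_os.items():
        if needle in useragent:
            return label
    return "unknown"
```

The iPad user agent in the examples contains "Mac OS" and is classified as "Mac/Apple", while the Symbian Nokia one matches nothing and stays "unknown".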
USERAGENT
USERAGENT
findBot
guessOS
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
baiduspider
unknown
Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10
Mac/Apple
Nokia6680/1.0 (4.04.07) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 Configuration/CLDC-1.1
unknown
pylogsparser-0.4/normalizers/URLparser.xml
This normalizer extracts additional info from URLs such as domain, protocol, etc.
Ce normaliseur extrait des données supplémentaires des URLs telles que le domaine, le protocole, etc.
mhu@wallix.com
parsed = urlparse.urlparse(value)
if parsed.hostname:
    log['url_hostname'] = parsed.hostname
    log['url_domain'] = extras.get_domain(parsed.hostname)
if parsed.path:
    log['url_path'] = parsed.path
if parsed.scheme:
    log['url_proto'] = parsed.scheme
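For reference, the same decomposition can be reproduced with the standard library alone (`urlparse` became `urllib.parse` in Python 3); the library's `extras.get_domain` helper is omitted here.

```python
from urllib.parse import urlparse

# Decompose a URL into the same pieces the decodeURL callback extracts.
parsed = urlparse('http://www.wallix.org/2011/09/20/how-to-use-linux-containers-lxc-under-debian-squeeze/')
url_hostname = parsed.hostname   # -> url_hostname tag
url_proto = parsed.scheme        # -> url_proto tag
url_path = parsed.path           # -> url_path tag
```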
URL
URL
decodeURL
http://www.wallix.org/2011/09/20/how-to-use-linux-containers-lxc-under-debian-squeeze/
www.wallix.org
http
/2011/09/20/how-to-use-linux-containers-lxc-under-debian-squeeze/
wallix.org
pylogsparser-0.4/normalizers/cisco-asa_msg.xml
This normalizer is able to parse logs from Cisco ASA devices.
Normalization is performed on the log body extracted by the cisco-asa_header normalizer.
The normalizer has been tested with Cisco ASA IOS version 8.4.
Ce normaliseur reconnaît les logs des équipements Cisco ASA.
La normalisation est réalisée sur le corps du log extrait par le normaliseur cisco-asa_header.
Ce normaliseur a été testé avec un Cisco ASA IOS en version 8.4.
fbo@wallix.com
Matches a word.
Un mot.
[^ ]+
Matches a protocol word.
Un protocole.
(?:(?:TCP)|(?:UDP)|(?:ICMP)|(?:tcp)|(?:udp)|(?:icmp)|\d+)
Matches an action word.
Une action.
(?:(?:denied)|(?:permitted)|(?:Successful)|(?:Rejected)|(?:failed)|(?:succeeded))
Matches an authentication word.
Un type d'autorisation.
(?:(?:authentication)|(?:authorization)|(?:accounting)|(?:Authentication)|(?:Authorization))
log['protocol'] = value.lower()
log['action'] = value.lower()
log['type'] = value.lower()
cisco-asa
Format of log event_id 305011
Format du log event_id 305011
Built TYPE PROTOCOL translation from SINT:SIP/SPT to DINT:DIP/DPT
Translation type
Le type de translation
TYPE
The protocol of translated connection.
Le protocole de la connexion translatée.
PROTOCOL
lower_protocol
Inbound interface
Interface d'entrée
SINT
Source IP address
Adresse IP source
SIP
Source port
Port source
SPT
Outbound interface
Interface de sortie
DINT
Destination IP address
Adresse IP de destination
DIP
Destination port
Port de destination
DPT
Built dynamic TCP translation from inside:192.168.1.50/1107 to outside:172.22.1.254/1025
dynamic
tcp
inside
172.22.1.254
1025
192.168.1.50
Format of logs event_id 302015/302013
Format des logs event_id 302015/302013
Built (?:(?:inbound)|(?:outbound)) PROTOCOL connection ID for SINT:SIP/SPT \([^ ]+/\d+\) to DINT:DIP/DPT \([^ ]+/\d+\)(?: \(USER\))?
The protocol of translated connection.
Le protocole de la connexion translatée.
PROTOCOL
lower_protocol
Connection ID
Identifiant de connexion
ID
Inbound interface
Interface d'entrée
SINT
Source IP address
Adresse IP source
SIP
Source port
Port source
SPT
Outbound interface
Interface de sortie
DINT
Destination IP address
Adresse IP de destination
DIP
Destination port
Port de destination
DPT
User related to this event
Utilisateur en rapport avec cet événement
USER
Built inbound UDP connection 732748 for outside:192.168.208.63/49804 (192.168.208.63/49804) to inside:192.168.150.70/53 (192.168.150.70/53)
732748
udp
outside
192.168.208.63
192.168.150.70
53
Built inbound TCP connection 733280 for outside:192.168.208.63/51606 (192.168.208.63/51606) to inside:192.168.150.70/80 (192.168.150.70/80) (myuser)
733280
tcp
outside
192.168.208.63
192.168.150.70
80
myuser
Format of log event_id 106023
Format du log event_id 106023
Deny PROTOCOL src SINT:SIP(?:/SPT)? dst DINT:DIP(?:/DPT)?(?: \(.*\))? by access-group "GROUP" .*
The protocol of translated connection.
Le protocole de la connexion translatée.
PROTOCOL
lower_protocol
Inbound interface
Interface d'entrée
SINT
Source IP address
Adresse IP source
SIP
Source port
Port source
SPT
Outbound interface
Interface de sortie
DINT
Destination IP address
Adresse IP de destination
DIP
Destination port
Port de destination
DPT
Group related to this event
Groupe en rapport avec cet événement
GROUP
Deny icmp src outside:192.168.208.63 dst inside:192.168.150.77 (type 8, code 0) by access-group "OUTSIDE" [0xd3f63b90, 0x0]
icmp
outside
192.168.208.63
192.168.150.77
OUTSIDE
Deny tcp src outside:192.168.208.63/51585 dst inside:192.168.150.77/288 by access-group "OUTSIDE" [0x5063b82f, 0x0]
tcp
outside
192.168.208.63
192.168.150.77
288
OUTSIDE
Format of log event_id 106010
Format du log event_id 106010
Deny inbound protocol PROTOCOL src SINT:SIP dst DINT:DIP
The protocol of translated connection.
Le protocole de la connexion translatée.
PROTOCOL
lower_protocol
Inbound interface
Interface d'entrée
SINT
Source IP address
Adresse IP source
SIP
Outbound interface
Interface de sortie
DINT
Destination IP address
Adresse IP de destination
DIP
Deny inbound protocol 47 src outside:192.168.0.1 dst outside:127.0.0.10
47
outside
outside
192.168.0.1
127.0.0.10
Format of logs event_id 605005/605004
Format des logs event_id 605005/605004
Login ACTION from SIP/SPT to SINT:DIP/DPT for user "USER"
Source IP address
Adresse IP source
SIP
Inbound interface
Interface d'entrée
SINT
Destination IP address
Adresse IP de destination
DIP
Source port
Port source
SPT
Destination port
Port de destination
DPT
User related to this event
Utilisateur en rapport avec cet événement
USER
Action taken by the device
Action prise par l'équipement
ACTION
lower_action
Login permitted from 192.168.202.51/3507 to inside:192.168.2.20/ssh for user "admin"
inside
3507
192.168.2.20
admin
permitted
Format of logs event_id 113004/113005
Format des logs event_id 113004/113005
AAA user AAATYPE ACTION : (?:reason = [^:]+: )?server = DIP : user = USER
AAA type
AAA type
AAATYPE
lower_type
Destination IP address
Adresse IP de destination
DIP
User related to this event
Utilisateur en rapport avec cet événement
USER
Action taken by the device
Action prise par l'équipement
ACTION
lower_action
AAA user authentication Successful : server = 10.1.206.27 : user = userx
10.1.206.27
userx
authentication
successful
AAA user authentication Rejected : reason = AAA failure : server = 10.10.1.2 : user = vpn_user
10.10.1.2
vpn_user
authentication
rejected
Format of logs event_id 109005/109006/109007/109008
Format des logs event_id 109005/109006/109007/109008
AAATYPE ACTION for user 'USER' from SIP/SPT to DIP/DPT on interface SINT
AAA type
AAA type
AAATYPE
lower_type
Action taken by the device
Action prise par l'équipement
ACTION
lower_action
User related to this event
Utilisateur en rapport avec cet événement
USER
Source IP address
Adresse IP source
SIP
Source port
Port source
SPT
Destination IP address
Adresse IP de destination
DIP
Destination port
Port de destination
DPT
Inbound interface
Interface d'entrée
SINT
Authentication succeeded for user 'userjane' from 172.28.4.41/0 to 10.1.1.10/24 on interface outside
10.1.1.10
24
userjane
authentication
succeeded
outside
Authorization denied for user 'user1' from 192.168.208.63/57315 to 192.168.134.21/21 on interface outside
192.168.134.21
21
57315
user1
authorization
denied
outside
Format of logs event_id 611101/611102
Format des logs event_id 611101/611102
User authentication ACTION: Uname: USER
Action taken by the device
Action prise par l'équipement
ACTION
lower_action
User related to this event
Utilisateur en rapport avec cet événement
USER
User authentication succeeded: Uname: alex
alex
succeeded
Generic pattern matching logs like 109024/109025/201010/109023/...
Règle générique pour les logs de type 109024/109025/201010/109023/...
.+ from SIP/SPT to DIP/DPT (?:\(.+\) )?on interface DINT(?:(?: using PROTOCOL)|.+|$)
Source IP address
Adresse IP source
SIP
Source port
Port source
SPT
Destination IP address
Adresse IP de destination
DIP
Destination port
Port de destination
DPT
Inbound interface
Interface d'entrée
DINT
The protocol of translated connection.
Le protocole de la connexion translatée.
PROTOCOL
lower_protocol
User related to this event
Utilisateur en rapport avec cet événement
USER
Authorization denied from 111.111.111.111/12345 to 222.222.222.222/12345 (not authenticated) on interface inside using https
inside
https
Authorization denied (acl=RS1) for user 'username' from 10.10.10.9/137 to 10.10.10.255/137 on interface outside using UDP
outside
udp
10.10.10.9
137
User from 192.168.5.2/56985 to 192.168.100.2/80 on interface outside must authenticate before using this service
outside
80
Generic pattern matching logs like 108003/410002/324005/421007/500005/109028/608002/...
Règle générique pour les logs de type 108003/410002/324005/421007/500005/109028/608002/...
.+ (?:from|for) (?:SINT:)?SIP/SPT to (?:DINT:)?DIP/DPT.*
Source IP address
Adresse IP source
SIP
Source port
Port source
SPT
Destination IP address
Adresse IP de destination
DIP
Destination port
Port de destination
DPT
Inbound interface
Interface d'entrée
SINT
Outbound interface
Interface de sortie
DINT
Dropped 189 DNS responses with mis-matched id in the past 10 second(s): from outside:192.0.2.2/3917 to inside:192.168.60.1/53
outside
inside
Generic pattern trying to match a user id in a log
Règle essayant de trouver un identifiant dans un log
.+(?:(?:[Uu]ser =)|(?:Uname:)|(?:Username =)) USER.*
User related to this event
Utilisateur en rapport avec cet événement
USER
[aaa protocol] Unable to decipher response message Server = 10.10.3.2, User = fbo
fbo
pylogsparser-0.4/normalizers/GeoIPsource.xml
This filter evaluates the country of origin associated to the source_ip tag.
Ce filtre détermine le pays d'origine associé à la valeur du tag source_ip.
mhu@wallix.com
country = country_code_by_address(value)
if country:
    log['source_country'] = country
This pattern simply checks the source_ip tag.
Ce motif se contente d'analyser le tag source_ip.
IP
IP
decodeCountryOfOrigin
8.8.8.8
US
77.207.23.14
FR
pylogsparser-0.4/normalizers/xferlog.xml
This normalizer handles FTP logs in the xferlog format.
This format is supported by a wide range of FTP servers like Wu-Ftpd, VSFTPd, ProFTPD or standard BSD ftpd.
The "program" tag is therefore set to the generic value "ftpd".
Ce normaliseur traite les logs au format xferlog.
Le format xferlog est utilisé pour consigner les événements par de nombreux serveurs FTP, tels que
Wu-Ftpd, ProFTPD ou la version BSD de ftpd.
La métadonnée "program" reçoit de fait la valeur générique "ftpd".
clo@wallix.com
Expression matching a date in the DDD MMM dd hh:mm:ss YYYY format.
[A-Z]{1}[a-z]{2} [A-Z]{1}[a-z]{2} \d{1,2} \d{2}:\d{2}:\d{2} \d{4}
Expression matching a vsftpd field (one or more non-whitespace characters).
\S+
Expression matching a vsftpd field more accurately than the 'vsftpd field' tagType. Possible values are a or b, see the description of the tag [with the same name] for details.
a|b
Expression matching a vsftpd field more accurately than the 'vsftpd field' tagType. Possible values are _, C, U or T, see the description of the tag [with the same name] for details.
_|C|T|U
Expression matching a vsftpd field more accurately than the 'vsftpd field' tagType. Possible values are o or i, see the description of the tag [with the same name] for details.
o|i
Expression matching a vsftpd field more accurately than the 'vsftpd field' tagType. Possible values are a, g or r, see the description of the tag [with the same name] for details.
a|g|r
Expression matching a vsftpd field more accurately than the 'vsftpd field' tagType. Possible values are 0 or 1, see the description of the tag [with the same name] for details.
0|1
Expression matching a vsftpd field more accurately than the 'vsftpd field' tagType. Possible values are c or i, see the description of the tag [with the same name] for details.
c|i
decoder = {'a' : 'ascii', 'b' : 'binary'}
log['transfer_type'] = decoder.get(value, 'UNKNOWN')
decoder = {'C' : 'compressed', 'U' : 'uncompressed', 'T' : "tar'ed", "_" : "none"}
log['special_action'] = decoder.get(value, 'UNKNOWN')
decoder = {'o' : 'outgoing', 'i' : 'ingoing'}
log['direction'] = decoder.get(value, 'UNKNOWN')
decoder = {'a' : 'anonymous', 'g' : 'guest', 'r' : 'real'}
log['access_mode'] = decoder.get(value, 'UNKNOWN')
decoder = {'0' : 'none', '1' : 'RFC931'}
log['authentication_method'] = decoder.get(value, 'UNKNOWN')
decoder = {'c' : 'complete', 'i' : 'incomplete'}
log['completion_status'] = decoder.get(value, 'UNKNOWN')
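Taken together, the flag decoders above map the single-character xferlog fields like this; the `decode_flags` helper is a standalone sketch, not part of the normalizer.

```python
# Standalone sketch of the xferlog single-character flag decoders.
decoders = {
    'transfer_type': {'a': 'ascii', 'b': 'binary'},
    'special_action': {'C': 'compressed', 'U': 'uncompressed', 'T': "tar'ed", '_': 'none'},
    'direction': {'o': 'outgoing', 'i': 'ingoing'},
    'access_mode': {'a': 'anonymous', 'g': 'guest', 'r': 'real'},
}

def decode_flags(tsf_type, spe_act, direction, acc_mode):
    # unknown flag characters fall back to 'UNKNOWN', as in the callbacks above
    return {
        'transfer_type': decoders['transfer_type'].get(tsf_type, 'UNKNOWN'),
        'special_action': decoders['special_action'].get(spe_act, 'UNKNOWN'),
        'direction': decoders['direction'].get(direction, 'UNKNOWN'),
        'access_mode': decoders['access_mode'].get(acc_mode, 'UNKNOWN'),
    }
```

The sample entry's flags `a _ o a` decode to ascii / none / outgoing / anonymous, matching the expected tags for that example.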
DATE\s+TSF_TIME\s+RMT_HOST\s+BYT_COUNT\s+FILENAME\s+TSF_TYPE\s+SPE_ACT_FLAG\s+DIR\s+ACC_MODE\s+USERNAME\s+SVC_NAME\s+AUTH_METHOD\s+AUTHENTICATED_USER_ID\s+COMPLETION_STATUS
The current local time in the form "DDD MMM dd hh:mm:ss YYYY", where DDD is the day of the week, MMM is the month, dd is the day of the month, hh is the hour, mm is the minutes, ss is the seconds, and YYYY is the year.
DATE
DDD MMM dd hh:mm:ss YYYY
The total time of the transfer in seconds.
TSF_TIME
The remote host name.
RMT_HOST
The amount of transferred bytes.
BYT_COUNT
The canonicalized (all symbolic links are resolved) absolute pathname of the transferred file. In case of a chrooted FTP session this field can be interpreted as the pathname in the chrooted environment (the default interpretation) or as the one in the real file system. The second type of interpretation can be enabled by the command-line options of ftpd.
FILENAME
The single character that indicates the type of the transfer. The set of possible values is: 'a' (an ascii transfer) or 'b' (a binary transfer).
TSF_TYPE
decode_transfer_type
One or more single-character flags indicating any special action taken. The set of possible values is: '_' (no action was taken), 'C' (the file was compressed [not in use]), 'U' (the file was uncompressed [not in use]) or 'T' (the file was tar'ed [not in use]).
SPE_ACT_FLAG
decode_special_action_flag
The direction of the transfer. The set of possible values is: 'o' (the outgoing transfer) or 'i' (the incoming transfer).
DIR
decode_direction
The method by which the user is logged in. The set of possible values is: 'a' [anonymous] (the anonymous guest user), 'g' [guest] (the real but chrooted user [this capability is guided by the ftpchroot(5) file]) or 'r' [real] (the real user).
ACC_MODE
decode_access_mode
The user's login name in case of the real user, or the user's identification string in case of the anonymous user (by convention it is an email address of the user).
USERNAME
The name of the service being invoked. The ftpd utility uses the 'ftp' keyword.
SVC_NAME
The authentication method used. The set of possible values is: '0' (none) or '1' (RFC931 authentication [not in use]).
AUTH_METHOD
decode_authentication_method
The user id returned by the authentication method. The '*' symbol is used if an authenticated user id is not available.
AUTHENTICATED_USER_ID
The single character that indicates the status of the transfer. The set of possible values is: 'c' a complete transfer or 'i' an incomplete transfer.
COMPLETION_STATUS
decode_completion_status
Thu Mar 4 08:12:30 2004 1 202.114.40.242 37 /incoming/index.html a _ o a guest@my.net ftp 0 * c
1
202.114.40.242
37
/incoming/index.html
a
ascii
_
none
o
outgoing
a
anonymous
guest@my.net
ftp
0
none
*
complete
c
ftpd
file transfer
ftpd
pylogsparser-0.4/normalizers/MSExchange2007MessageTracking.xml
This parser defines how to normalize specific MS Exchange flat files, based on the observed behavior of MS Exchange 2007 (trial version); consistency with other versions remains to be confirmed, but it is unlikely to cause any trouble.
This parser describes the format of Exchange 2007's Message Tracking Log (something similar to Postfix logs), a CSV-like flat file that can be found at
C:\Program Files\Microsoft\Exchange Server\TransportRoles\Logs\MessageTracking on a standard install.
Ce normaliseur analyse certains fichiers de logs générés par MS Exchange 2007. Bien que ce normaliseur ait été écrit par rétro-analyse du comportement d'une version d'évaluation, il devrait être adapté aux versions complètes d'Exchange.
Ce normaliseur décrit le format du "Message Tracking Log" (un journal d'événements similaire à celui d'un serveur Postfix), un fichier plat de type CSV qui se trouve à l'emplacement suivant dans une installation standard : C:\Program Files\Microsoft\Exchange Server\TransportRoles\Logs\MessageTracking .
mhu@wallix.com
The log's specific dateformat
Le format d'horodatage spécifique à ce type de log
\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[.]\d{3}Z
the message ID
l'identifiant de message
<.+@.+>
the source context
le contexte de la source
"?(?:[^,]+, )*[^,]+"?
log['message_id'] = value[1:-1]
if value.startswith('"'):
    value = value[1:-1]
d = dict([u.split(':', 1) for u in value.split(', ')])
# convert camelCase field names into underscore_separated names
r = re.compile('[A-Z][a-z0-9]+')
rdate = re.compile("""
    (?P<year>\d{4})-
    (?P<month>\d{2})-
    (?P<day>\d{2})
    T(?P<hour>\d{2}):
    (?P<minute>\d{2}):
    (?P<second>\d{2})\.
    (?P<microsecond>\d{1,3})?Z""", re.VERBOSE)
for name, v in list(d.items()):
    new_name = name
    words = r.findall(name)
    if words:
        new_name = '_'.join(words)
    new_value = v
    if rdate.match(v):
        m = rdate.match(v).groupdict()
        if m['microsecond'] is None:
            m['microsecond'] = 0
        m = dict([(u, int(w)) for u, w in m.items()])
        m['microsecond'] = m['microsecond'] * 1000
        new_value = datetime(**m).ctime()
    del d[name]
    d[new_name.lower()] = new_value
log.update(d)
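The camelCase-to-underscore renaming used by the source-context decoding can be checked in isolation; the `underscore` helper below is a sketch of that one step, not a function of the library.

```python
import re

# Standalone sketch of the field renaming in decode_MTLSourceContext:
# split on capitalized words, join with underscores, lowercase the result.
r = re.compile('[A-Z][a-z0-9]+')

def underscore(name):
    words = r.findall(name)
    # names with no capitalized word (e.g. all-caps acronyms) are kept as-is
    return '_'.join(words).lower() if words else name
```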
The Message Tracking Log Format as described in the first line of the log file
Le format du Message Tracking Log, tel qu'il apparaît en première ligne du journal d'événements
DATE,CLIENT_IP,CLIENT_HOSTNAME,SERVER_IP,SERVER_HOSTNAME,CONTEXT_EXCHANGE_SOURCE,CONNECTOR_ID,EXCHANGE_SOURCE,EVENT_ID,INTERNAL_MESSAGE_ID,MESSAGE_ID,RECIPIENT_ADDRESS,RECIPIENT_STATUS,TOTAL_BYTES,RECIPIENT_COUNT,RELATED_RECIPIENT_ADDRESS,REFERENCE,MESSAGE_SUBJECT,SENDER_ADDRESS,RETURN_PATH,MESSAGE_INFO
the log's timestamp
l'horodatage de l'événement
DATE
ISO8601
the client's IP address
l'adresse IP du client
CLIENT_IP
the client's hostname
le nom d'hôte du client
CLIENT_HOSTNAME
the server's IP address
l'adresse IP du serveur
SERVER_IP
the server's hostname
le nom d'hôte du serveur
SERVER_HOSTNAME
the source context
le contexte de la source
CONTEXT_EXCHANGE_SOURCE
decode_MTLSourceContext
the connector ID
l'identifiant du connecteur
CONNECTOR_ID
EXCHANGE_SOURCE
the event ID
l'identifiant d'événement
EVENT_ID
the internal message ID
l'identifiant interne du message
INTERNAL_MESSAGE_ID
the message ID
l'identifiant du message
MESSAGE_ID
decode_MTLMessageID
the recipient's address
l'adresse du destinataire
RECIPIENT_ADDRESS
the recipient's status
le statut du destinataire
RECIPIENT_STATUS
total bytes in the transaction
le nombre total d'octets de la transaction
TOTAL_BYTES
RECIPIENT_COUNT
RELATED_RECIPIENT_ADDRESS
REFERENCE
the message's subject
le sujet du message
MESSAGE_SUBJECT
the sender's address
l'adresse de l'expéditeur
SENDER_ADDRESS
the return path
l'adresse de retour
RETURN_PATH
some additional information about the message
de l'information supplémentaire sur le message
MESSAGE_INFO
MS Exchange 2007 Message Tracking
2010-04-19T12:29:07.390Z,10.10.14.73,WIN2K3DC,,WIN2K3DC,"MDB:ada3d2c3-6f32-45db-b1ee-a68dbcc86664, Mailbox:68cf09c1-1344-4639-b013-3c6f8a588504, Event:1440, MessageClass:IPM.Note, CreationTime:2010-04-19T12:28:51.312Z, ClientType:User",,STOREDRIVER,SUBMIT,,<C6539E897AEDFA469FE34D029FB708D43495@win2k3dc.qa.ifr.lan>,,,,,,,Coucou !,user7@qa.ifr.lan,,
MS Exchange 2007 Message Tracking
10.10.14.73
WIN2K3DC
WIN2K3DC
68cf09c1-1344-4639-b013-3c6f8a588504
User
STOREDRIVER
SUBMIT
C6539E897AEDFA469FE34D029FB708D43495@win2k3dc.qa.ifr.lan
Coucou !
user7@qa.ifr.lan
mail
pylogsparser-0.4/normalizers/common_tagTypes.xml
]>
Matches everything and anything.
Chaîne de caractères de longueur arbitraire.
.*
Matches a variable-length integer.
Entier positif.
\d+
Matches an EPOCH timestamp or a positive decimal number.
Horodatage au format EPOCH, ou nombre décimal positif.
\d+(?:\.\d*)?
Expression matching syslog dates.
Date au format syslog.
[A-Z][a-z]{2} [ 0-9]\d \d{2}:\d{2}:\d{2}
Matches an URL.
Correspond à une URL (http/https).
http[s]?://[^ "'*]+
Matches a MAC address.
Correspond à une adresse MAC.
[0-9a-fA-F]{2}:(?:[0-9a-fA-F]{2}:){4}[0-9a-fA-F]{2}
Matches an E-mail address.
Correspond à une adresse e-mail.
[a-zA-Z0-9+_\-\.]+@[0-9a-zA-Z][-.0-9a-zA-Z]*\.[a-zA-Z]+
Matches a numeric IP.
Correspond à une adresse IP numérique.
(?<![.0-9])(?:\d{1,3}\.){3}\d{1,3}(?![.0-9])
Matches a date written in Zulu Time
Correspond à une date exprimée au format "Zulu" ou UTC.
\d{4}-\d{2}-\d{2}(?:T\d{1,2}:\d{2}(?::\d{2}(?:[.]\d{1,5})?)?)?
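A quick standalone check of the numeric-IP tag type, written here with the dots escaped (an unescaped `.` in a regex matches any character); the lookarounds reject digit groups embedded in longer dotted runs.

```python
import re

# Numeric IP: four 1-3 digit groups, with lookbehind/lookahead guards so a
# match cannot sit inside a longer dotted sequence such as "1.2.3.4.5".
ip_re = re.compile(r'(?<![.0-9])(?:\d{1,3}\.){3}\d{1,3}(?![.0-9])')
```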
pylogsparser-0.4/normalizers/arkoonFAST360.xml
fbo@wallix.com
.*$
\d+
# define facilities
FACILITIES = { 0: "kernel",
1: "user",
2: "mail",
3: "daemon",
4: "auth",
5: "syslog",
6: "print",
7: "news",
8: "uucp",
9: "ntp",
10: "secure",
11: "ftp",
12: "ntp",
13: "audit",
14: "alert",
15: "ntp" }
for i in range(0, 8):
    FACILITIES[i+16] = "local%d" % i
# define severities
SEVERITIES = { 0: "emerg",
1: "alert",
2: "crit",
3: "error",
4: "warn",
5: "notice",
6: "info",
7: "debug" }
facility = int(value) // 8
severity = int(value) % 8
if facility not in FACILITIES or severity not in SEVERITIES:
    raise ValueError('facility or severity is out of range')
log["facility"] = "%s" % FACILITIES[facility]
log["severity"] = "%s" % SEVERITIES[severity]
log["facility_code"] = "%d" % facility
log["severity_code"] = "%d" % severity
# Key that must be found in log
mandatory_keys = ('id', 'time', 'gmtime',
'fw', 'aktype',
)
key_modifiers = {'pri' : 'priority',
'op' : 'method',
'aktype': 'event_id',
'src': 'source_ip',
'dst': 'dest_ip',
'port_src': 'source_port',
'port_dest': 'dest_port',
'dstname': 'dest_host',
'intf_in': 'inbound_int',
'intf_out': 'outbound_int'}
def extract_fw(data):
    ip_re = re.compile("(?<![.0-9])((?:[0-9]{1,3}[.]){3}[0-9]{1,3})(?![.0-9])")
    if ip_re.match(data['fw']):
        data['local_ip'] = data['fw']
    else:
        data['local_host'] = data['fw']
def extract_protocol(data):
    if 'proto' in data:
        if data['proto'].find('/') > 0:
            nump, protocol = data['proto'].split('/')
        else:
            protocol = data['proto']
        data['protocol'] = protocol
        del data['proto']
def quote_stripper(data):
    for k in data.keys():
        data[k] = data[k].strip('"')
def extract_date(data):
    data['date'] = datetime.utcfromtimestamp(float(data['gmtime']))
    for key in ('time', 'gmtime'):
        del data[key]
def alert_description_modify(data):
    messages = [
        re.compile("TCP from (?P<source_ip>.+):(?P<source_port>.+) to (?P<dest_ip>.+):(?P<dest_port>.+)\s+\[(?P<description>.*)\]"),
        re.compile("UDP from (?P<source_ip>.+):(?P<source_port>.+) to (?P<dest_ip>.+):(?P<dest_port>.+)\s+\[(?P<description>.*)\]"),
        re.compile('ICMP:(?P<dest_port>.+)\.(?P<source_port>.+) from (?P<source_ip>.+) to (?P<dest_ip>.+) \[(?P<description>.*)\]'),
        re.compile('PROTO:(?P<protocol>.+) from (?P<source_ip>.+) to (?P<dest_ip>.+) \[(?P<description>.*)\]'),
        re.compile('Unsequenced packet on non-TCP proto from (?P<source_ip>.+):(?P<source_port>.+)'),
        re.compile('Unsequenced TCP packet from (?P<source_ip>.+):(?P<source_port>.+) to (?P<dest_ip>.+):(?P<dest_port>.+)'),
        re.compile('ACK unsequenced packet on non-TCP proto from (?P<source_ip>.+):(?P<source_port>.+)'),
        re.compile('ACK unsequenced TCP packet from (?P<source_ip>.+):(?P<source_port>.+) to (?P<dest_ip>.+):(?P<dest_port>.+)'),
        re.compile('Bad flags on non-TCP proto from (?P<source_ip>.+):(?P<source_port>.+)'),
        re.compile('Bad TCP flags (?P<flags>.+) from (?P<source_ip>.+):(?P<source_port>.+) to (?P<dest_ip>.+):(?P<dest_port>.+)'),
        re.compile('Bad packet from (?P<source_ip>.+):(?P<source_port>.+) to (?P<dest_ip>.+):(?P<dest_port>.+) \[(?P<description>.*)\]'),
        re.compile('Land attack from (?P<source_ip>.+) to (?P<dest_ip>.+)'),
        re.compile('New value: (?P<source_ip>.+)/(?P<network_mask>.+) \[(?P<ports>.+)\]'),
    ]
    if 'alert_desc' in data:
        for m in messages:
            values = m.match(data['alert_desc'])
            if values:
                data.update(values.groupdict())
def profile_modifier(data):
    profiles = {
        1: 'FTP_BADFILES',
        2: 'FTP_SCAN',
        3: 'FTP',
        4: 'HTTP',
        5: 'HTTP_BADURL',
        6: 'HTTP_COLDFUSION',
        7: 'HTTP_FRONTPAGE',
        8: 'HTTP_IIS',
        9: 'HTTP_PHP',
        10: 'HTTP_NETSCAPE',
        11: 'HTTP_TOMCAT',
        12: 'HTTP_APACHE',
        13: 'HTTP_WINDOWS',
        14: 'HTTP_ORACLE',
        15: 'HTTP_TALENTSOFT',
        16: 'HTTP_LOTUS',
        17: 'HTTP_UNIX',
        18: 'HTTP_CISCO',
        19: 'HTTP_WEBLOGIC',
        20: 'HTTP_MYSQL',
        21: 'HTTP_MACOS',
        22: 'HTTP_VIRUSWALL',
        23: 'SMTP',
        24: 'IMAP4',
        25: 'POP3',
        26: 'DNS'}
    if 'profile' in data:
        data['profile'] = profiles.get(int(data['profile']), data['profile'])
def reason_modify(data):
    messages = [
        re.compile('Virus (?P<virus_name>.+) found in (?P<file_name>.+)'),
        re.compile('File (?P<file_name>.+) encrypted'),
        re.compile('File (?P<file_name>.+): analyze error'),
        re.compile('Denied by rule (?P<rule_name>.+)'),
        re.compile('Denied by rule (?P<rule_name>.+), put mail in quarantine')
    ]
    if 'reason' in data:
        for m in messages:
            values = m.match(data['reason'])
            if values:
                data.update(values.groupdict())
kvre = '(?P<key>[A-Za-z_\-]{2,})=(?P<val>[^" ]+|"[^"]*")'
reg = re.compile(kvre)
data = reg.findall(value)
data = dict(data)
# Verify it is the expected log
if not set(mandatory_keys).issubset(set(data.keys())):
    return log
if data['id'] != 'firewall':
    return log
# Remove quoted values
quote_stripper(data)
# Set tag body
data['body'] = value
# Add a date field from gmtime field
extract_date(data)
# Extract useful fields from alert description
alert_description_modify(data)
# Convert IDPS profile
profile_modifier(data)
# SMTP reason modifier
reason_modify(data)
# Apply keys modifiers
for k, v in key_modifiers.items():
    if k in data.keys():
        s_val = data[k]
        del data[k]
        data[v] = s_val
# Process fw tag
extract_fw(data)
# Process proto field
extract_protocol(data)
# Remove tag with empty value
for k, v in data.items():
    if not v:
        del data[k]
# Convert tag name with hyphen to underscore
for k, v in data.items():
    if k.find('-') > -1:
        del data[k]
        k = k.replace('-', '_')
        data[k] = v
log['program'] = 'arkoon'
log.update(data)
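The extraction above rests on the `kvre` expression: a key of at least two characters followed by either an unquoted, space-free value or a double-quoted one. A small sketch of that tokenizing step on a shortened sample line, including the quote stripping:

```python
import re

# The key=value tokenizer used by the Arkoon extractor.
kvre = r'(?P<key>[A-Za-z_\-]{2,})=(?P<val>[^" ]+|"[^"]*")'
reg = re.compile(kvre)

line = 'id=firewall pri=4 alert_level="Low" fw=myArkoon'
data = dict(reg.findall(line))

# Equivalent of the quote_stripper step: drop surrounding double quotes.
for k, v in data.items():
    if v.startswith('"') and v.endswith('"'):
        data[k] = v[1:-1]
```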
(?:<PRIORITY>[^\s]+:\s)?AKLOG\s*-\s*KEYVALUES
KEYVALUES
extractAKkv
PRIORITY
decode_priority
AKLOG - id=firewall time="2004-02-25 17:38:51" pri=4 fw=myArkoon aktype=ALERT gmtime=1077727131 alert_type="Blocked by application control" user="userName" alert_level="Low" alert_desc="TCP from 10.10.192.61:33027 to 10.10.192.156:25 [default rule]"
arkoon
ALERT
4
myArkoon
userName
Low
10.10.192.156
25
10.10.192.61
33027
default rule
Low
id=firewall time="2004-02-25 17:38:51" pri=4 fw=myArkoon aktype=ALERT gmtime=1077727131 alert_type="Blocked by application control" user="userName" alert_level="Low" alert_desc="TCP from 10.10.192.61:33027 to 10.10.192.156:25 [default rule]"
firewall
AKLOG-id=firewall time="2004-02-25 17:38:57" fw=myArkoon aktype=IP gmtime=1077727137 ip_log_type=ENDCONN src=10.10.192.61 dst=10.10.192.255 proto="137/udp" protocol=17 port_src=137 port_dest=137 intf_in=eth0 intf_out= pkt_len=78 nat=NO snat_addr=0 snat_port=0 dnat_addr=0 dnat_port=0 user="userName" pri=3 rule="myRule" action=DENY reason="Blocked by filter" description="dst addr received from Internet is private"
arkoon
IP
3
myArkoon
userName
10.10.192.255
10.10.192.61
Blocked by filter
ENDCONN
eth0
udp
2004-02-25 16:38:57
id=firewall time="2004-02-25 17:38:57" fw=myArkoon aktype=IP gmtime=1077727137 ip_log_type=ENDCONN src=10.10.192.61 dst=10.10.192.255 proto="137/udp" protocol=17 port_src=137 port_dest=137 intf_in=eth0 intf_out= pkt_len=78 nat=NO snat_addr=0 snat_port=0 dnat_addr=0 dnat_port=0 user="userName" pri=3 rule="myRule" action=DENY reason="Blocked by filter" description="dst addr received from Internet is private"
firewall
AKLOG-id=firewall time="2004-02-25 17:38:57" fw=myArkoon aktype=IDPSMATCH gmtime=1077727137 src=10.10.192.61 dst=10.10.192.255 proto="137/udp" protocol=17 port_src=137 port_dest=137 profile=1 sid=123 score=50
arkoon
IDPSMATCH
myArkoon
10.10.192.255
10.10.192.61
50
FTP_BADFILES
udp
2004-02-25 16:38:57
id=firewall time="2004-02-25 17:38:57" fw=myArkoon aktype=IDPSMATCH gmtime=1077727137 src=10.10.192.61 dst=10.10.192.255 proto="137/udp" protocol=17 port_src=137 port_dest=137 profile=1 sid=123 score=50
firewall
AKLOG-id=firewall time="2004-02-25 17:38:57" fw=myArkoon aktype=IDPSALERT gmtime=1077727137 src=10.10.192.61 dst=10.10.192.255 proto="137/udp" protocol=17 port_src=137 port_dest=137 profile=1 endcnx_score=100 ch=1 reaction=0
arkoon
IDPSALERT
myArkoon
10.10.192.255
10.10.192.61
137
137
1
FTP_BADFILES
udp
2004-02-25 16:38:57
id=firewall time="2004-02-25 17:38:57" fw=myArkoon aktype=IDPSALERT gmtime=1077727137 src=10.10.192.61 dst=10.10.192.255 proto="137/udp" protocol=17 port_src=137 port_dest=137 profile=1 endcnx_score=100 ch=1 reaction=0
firewall
AKLOG-id=firewall time="2004-02-25 17:42:54" fw=myArkoon pri=6 aktype=HTTP gmtime=1077727374 src=10.10.192.61 proto=http user="userName" op="GET" dstname=www arg="http://www/ HTTP/1.1" ref="" agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020623 Debian/1.0.0-0.woody.1" rcvd=355 result=407
arkoon
HTTP
myArkoon
10.10.192.61
http://www/ HTTP/1.1
407
GET
www
http
2004-02-25 16:42:54
id=firewall time="2004-02-25 17:42:54" fw=myArkoon pri=6 aktype=HTTP gmtime=1077727374 src=10.10.192.61 proto=http user="userName" op="GET" dstname=www arg="http://www/ HTTP/1.1" ref="" agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020623 Debian/1.0.0-0.woody.1" rcvd=355 result=407
firewall
<134>IP-Logs: AKLOG - id=firewall time="2010-10-04 10:38:37" gmtime=1286181517 fw=doberman.jurassic.ta aktype=IP ip_log_type=NEWCONN src=172.10.10.107 dst=204.13.8.181 proto="http" protocol=6 port_src=2619 port_dest=80 intf_in=eth7 intf_out=eth2 pkt_len=48 nat=HIDE snat_addr=10.10.10.199 snat_port=16176 dnat_addr=0 dnat_port=0 tcp_seq=1113958286 tcp_ack=0 tcp_flags="SYN" user="" vpn-src="" pri=6 rule="surf_normal" action=ACCEPT
arkoon
IP
doberman.jurassic.ta
http
surf_normal
ACCEPT
id=firewall time="2010-10-04 10:38:37" gmtime=1286181517 fw=doberman.jurassic.ta aktype=IP ip_log_type=NEWCONN src=172.10.10.107 dst=204.13.8.181 proto="http" protocol=6 port_src=2619 port_dest=80 intf_in=eth7 intf_out=eth2 pkt_len=48 nat=HIDE snat_addr=10.10.10.199 snat_port=16176 dnat_addr=0 dnat_port=0 tcp_seq=1113958286 tcp_ack=0 tcp_flags="SYN" user="" vpn-src="" pri=6 rule="surf_normal" action=ACCEPT
firewall
pylogsparser-0.4/normalizers/sshd.xml 0000644 0001750 0001750 00000012676 11705765631 016264 0 ustar fbo fbo
This normalizer can parse connection messages logged by an SSH server.
Ce normaliseur analyse les événements de connexion à un serveur SSH.
mhu@wallix.com
matches the action logged for a connection
correspond à l'action de connexion
Failed|Accepted
if value == "Failed":
    log['action'] = 'fail'
else:
    log['action'] = 'accept'
A generic sshd log line.
Une notification standard de connexion à un serveur SSH.
ACTION METHOD for(?: invalid user)? USER from IP port [0-9]+ ssh[0-9]
the outcome of the connection attempt
le résultat de la tentative de connexion
ACTION
decode_action
the connection method (password or key)
la méthode de connexion utilisée (mot de passe ou clé asymétrique)
METHOD
the user requesting the connection
l'utilisateur à l'origine de la connexion
USER
the inbound connection's IP address
l'IP entrante de la connexion
IP
Failed password for admin from 218.49.183.17 port 49468 ssh2
fail
password
admin
218.49.183.17
access control
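A hedged sketch of the pattern above, with the tag placeholders (ACTION, METHOD, USER, IP) replaced by illustrative named groups — the real normalizer substitutes its own tag definitions:

```python
import re

# Illustrative translation of:
#   ACTION METHOD for(?: invalid user)? USER from IP port [0-9]+ ssh[0-9]
pattern = re.compile(
    r'(?P<action>Failed|Accepted) (?P<method>\S+) for(?: invalid user)? '
    r'(?P<user>\S+) from (?P<ip>\S+) port [0-9]+ ssh[0-9]'
)

m = pattern.match('Failed password for admin from 218.49.183.17 port 49468 ssh2')
fields = m.groupdict()
# decode_action maps "Failed" to "fail" and anything else to "accept".
fields['action'] = 'fail' if fields['action'] == 'Failed' else 'accept'
```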
pylogsparser-0.4/normalizers/apache.xml 0000644 0001750 0001750 00000034006 11705765631 016533 0 ustar fbo fbo
Apache normalizer. This parser supports log formats defined in apache's documentation, see http://httpd.apache.org/docs/current/logs.html .
Ce normaliseur analyse les logs émis par les serveurs web Apache. Seuls les formats décrits dans la documentation Apache sont supportés en standard : cf http://httpd.apache.org/docs/current/logs.html .
mhu@wallix.com
Matches apache's common time format.
Une expression correspondant au format d'horodatage par défaut d'Apache.
\[\d{1,2}/.{3}/\d{4}:\d{1,2}:\d{1,2}:\d{1,2}(?: [+-]\d{4})?\]
IP address or None.
Une adresse IP, ou un champ vide.
(?:(?:\d{1,3}\.){3}\d{1,3})|-
Integer or float, or None.
Une valeur numérique entière ou décimale, ou un champ vide.
[\d.,]+|-
DN, user name ...
Un "mot", ou un champ vide.
[\w.-]+|-
try:
    path = value.split(' ')[1].split('?')[0]
    log['url_path'] = path
    log['method'] = value.split(' ')[0]
except:
    pass
Common Log Format.
Structure des logs selon le schéma "Common Log Format".
%h %l %u %t "%r" %>s %b$
the remote host initiating the request
l'hôte distant à l'initiative de la requête
%h
the remote logname used to initiate the request
l'identifiant distant à l'initiative de la requête
%l
the remote user initiating the request
l'utilisateur distant à l'initiative de la requête
%u
the time at which the request was issued - please note that the timezone information is not carried over
la date à laquelle la requête a été émise. Veuillez noter que l'information de fuseau horaire n'est pas prise en compte
%t
dd/MMM/YYYY:hh:mm:ss
the first line of the request
la première ligne de la requête
%r
decode_url_path
the final status code for the request
le code de statut final pour la requête
%>s
the size of the response in bytes, including HTTP headers
la taille de la réponse émise en octets, en-têtes HTTP inclus
%b
127.0.0.1 - - [20/Jul/2009:00:29:39 +0300] "GET /index/helper/test HTTP/1.1" 200 889
127.0.0.1
-
-
GET /index/helper/test HTTP/1.1
200
889
apache
/index/helper/test
GET
web server
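One way to read the `%h %l %u %t "%r" %>s %b` layout is with a regular expression. This is an illustrative sketch, not the normalizer's internal machinery; it reuses the same splitting as `decode_url_path` for the request line:

```python
import re

# Illustrative Common Log Format matcher; group names are hypothetical.
clf = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)$'
)

line = ('127.0.0.1 - - [20/Jul/2009:00:29:39 +0300] '
        '"GET /index/helper/test HTTP/1.1" 200 889')
m = clf.match(line)

# Same splitting as decode_url_path: method first, URL path without query.
request = m.group('request')
method = request.split(' ')[0]
url_path = request.split(' ')[1].split('?')[0]
```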
"Combined" Log Format.
Structure des logs selon le schéma "Combined".
%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"$
the remote host initiating the request
l'hôte distant à l'initiative de la requête
%h
the remote logname used to initiate the request
l'identifiant distant à l'initiative de la requête
%l
the remote user initiating the request
l'utilisateur distant à l'initiative de la requête
%u
the time at which the request was issued - please note that the timezone information is not carried over
la date à laquelle la requête a été émise. Veuillez noter que l'information de fuseau horaire n'est pas prise en compte
%t
dd/MMM/YYYY:hh:mm:ss
the first line of the request
la première ligne de la requête
%r
decode_url_path
the final status code for the request
le code de statut final pour la requête
%>s
the size of the response in bytes, including HTTP headers
la taille de la réponse émise en octets, en-têtes HTTP inclus
%b
the contents of the "Referer" request header
le contenu de l'en-tête "Referer" de la requête
%{Referer}i
the contents of the "User-agent" request header
le contenu de l'en-tête "User-agent" de la requête
%{User-agent}i
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
127.0.0.1
-
frank
GET /apache_pb.gif HTTP/1.0
200
2326
apache
http://www.example.com/start.html
Mozilla/4.08 [en] (Win98; I ;Nav)
/apache_pb.gif
GET
web server
apache
pylogsparser-0.4/normalizers/wabauth.xml 0000644 0001750 0001750 00000026707 11705765631 016756 0 ustar fbo fbo
This normalizer is used to parse Wallix Admin Bastion authentication logs.
nba@wallix.com
primary_authentication
primary_authentication
session opened
session closed
A user name as defined in Wallix Admin Bastion
[^: ]+
an IP
[^: ]+
An arbitrary string
[^: ]*
an event raised when a user tries to authenticate on the WAB
type='PRIMARY_AUTHENTICATION' timestamp='[^']+' username='USERNAME' client_ip='CLIENT_IP' diagnostic='DIAG'
The event type
PRIMARY_AUTHENTICATION
The user account used.
USERNAME
The IP address of the connecting client.
CLIENT_IP
Connection attempt result.
DIAG
type='primary_authentication' timestamp='2011-12-20 16:21:50.427830' username='admin' client_ip='10.10.4.25' diagnostic='SUCCESS'
primary_authentication
admin
10.10.4.25
SUCCESS
access control
type='SESSION_OPENED' username='USERNAME' secondary='ACCOUNT@RESOURCE' client_ip='CLIENT_IP' src_protocol='SOURCE_PROTO' dst_protocol='DEST_PROTO' message='MESSAGE'
The event type
SESSION_OPENED
The user account used.
USERNAME
The target account used.
ACCOUNT
The target/resource accessed.
RESOURCE
The IP address of the connecting client.
CLIENT_IP
The protocol used by the client to connect to the WAB.
SOURCE_PROTO
The protocol used by the WAB to connect to the target/resource.
DEST_PROTO
Other comment.
MESSAGE
type='session opened' username='admin' secondary='root@debian32' client_ip='10.10.4.25' src_protocol='SFTP_SESSION' dst_protocol='SFTP_SESSION' message=''
root
10.10.4.25
SFTP_SESSION
debian32
SFTP_SESSION
session opened
admin
access control
an event raised when a user's session on the WAB is closed
type='SESSION_CLOSED' username='USERNAME' secondary='ACCOUNT@RESOURCE' client_ip='CLIENT_IP' src_protocol='SOURCE_PROTO' dst_protocol='DEST_PROTO' message='MESSAGE'
The event type
SESSION_CLOSED
The user account used.
USERNAME
The target account used.
ACCOUNT
The target/resource accessed.
RESOURCE
The IP address of the connecting client.
CLIENT_IP
The protocol used by the client to connect to the WAB.
SOURCE_PROTO
The protocol used by the WAB to connect to the target/resource.
DEST_PROTO
Other comment.
MESSAGE
type='session closed' username='admin' secondary='root@debian32' client_ip='10.10.4.25' src_protocol='SFTP_SESSION' dst_protocol='SFTP_SESSION' message=''
root
10.10.4.25
SFTP_SESSION
debian32
SFTP_SESSION
session closed
admin
access control
pylogsparser-0.4/normalizers/Fail2ban.xml 0000644 0001750 0001750 00000022313 11705765631 016726 0 ustar fbo fbo
This normalizer can parse Fail2ban logs (version 0.8.4).
Ce normaliseur traite les logs de l'applicatif Fail2ban (version 0.8.4).
mhu@wallix.com
\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}
fail2ban
\w+
(?:Ban)|(?:Unban)
timestamp, milliseconds = value.split(',', 1)
newdate = datetime(int(timestamp[:4]),
                   int(timestamp[5:7]),
                   int(timestamp[8:10]),
                   int(timestamp[11:13]),
                   int(timestamp[14:16]),
                   int(timestamp[17:19]))
log["date"] = newdate.replace(microsecond=int(milliseconds) * 1000)
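A concrete run of the decoder above on a typical timestamp — Fail2ban writes `YYYY-MM-DD HH:MM:SS,mmm`, with a comma separating the milliseconds:

```python
from datetime import datetime

# The slicing mirrors decodeF2bTimeStamp: fixed-width date fields,
# then a comma-separated millisecond suffix.
value = '2011-09-26 15:12:58,388'
timestamp, milliseconds = value.split(',', 1)
date = datetime(int(timestamp[:4]), int(timestamp[5:7]), int(timestamp[8:10]),
                int(timestamp[11:13]), int(timestamp[14:16]),
                int(timestamp[17:19])).replace(microsecond=int(milliseconds) * 1000)
```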
An information message about the application's general status.
Un message informatif concernant le statut général de l'application.
TIMESTAMP PROGRAM\.COMPONENT\s*: INFO\s+BODY
TIMESTAMP
decodeF2bTimeStamp
the program, set to "fail2ban"
le programme, cette métadonnée est toujours évaluée à "fail2ban"
PROGRAM
the program's component emitting the log
le composant du programme à l'origine du message
COMPONENT
the body of the message
le descriptif de l'événement
BODY
2011-09-27 05:02:26,908 fail2ban.server : INFO Changed logging target to /var/log/fail2ban.log for Fail2ban v0.8.4
fail2ban
server
Changed logging target to /var/log/fail2ban.log for Fail2ban v0.8.4
TIMESTAMP PROGRAM\.COMPONENT\s*: WARNING\s+\[PROTOCOL\] ACTION SOURCE_IP
TIMESTAMP
decodeF2bTimeStamp
the program, set to "fail2ban"
le programme, cette métadonnée est toujours évaluée à "fail2ban"
PROGRAM
the program's component emitting the log
le composant du programme à l'origine du message
COMPONENT
the protocol for which an action was taken
le protocole pour lequel une action a été appliquée
PROTOCOL
the action taken : ban, or unban
l'action appliquée : bannissement (ban) ou levée du bannissement (unban)
ACTION
the IP address for which the action was taken
l'adresse IP à l'origine de l'action appliquée
SOURCE_IP
2011-09-26 15:12:58,388 fail2ban.actions: WARNING [ssh] Ban 213.65.93.82
fail2ban
actions
ssh
Ban
213.65.93.82
access control
pylogsparser-0.4/normalizers/squid.xml 0000644 0001750 0001750 00000022012 11705765631 016431 0 ustar fbo fbo
This normalizer parses messages issued by the Squid proxy server.
Please note that only Squid's "native log format" is supported by this normalizer.
Ce normaliseur analyse les messages émis par les proxys Squid.
Seul le format "natif" des logs Squid est supporté par ce normaliseur.
mhu@wallix.com
single lexeme without inner spaces
unité sémantique sans espace interstitiel
[^ ]+
if value != "-":
    log["user"] = value
This pattern parses Squid's native log format.
Cette structure décrit le format "natif" des logs Squid.
EPOCH +ELAPSED IP CODE/REQUESTSTATUS SIZE METHOD URL USER PEERSTATUS/PEERHOST MIMETYPE
the log EPOCH timestamp
l'horodatage du log au format EPOCH
EPOCH
EPOCH
the user concerned by the request
l'utilisateur concerné par la requête
USER
decode_user
the elapsed time for the request
le temps écoulé pour la requête
ELAPSED
the remote host's IP
l'adresse IP de l'hôte distant
IP
the code returned by the proxy
le code de la réponse émise par le proxy
CODE
the request's status
le statut de la requête
REQUESTSTATUS
the size of the request's result
la taille du résultat de la requête
SIZE
the request's method
la méthode associée à la requête
METHOD
the requested URL
l'URL requêtée
URL
the peer's status
le statut du pair
PEERSTATUS
the peer's host
l'hôte du pair
PEERHOST
the MIME type of the result of the request
le type MIME du résultat de la requête
MIMETYPE
1259844091.407 307 82.238.42.70 TCP_MISS/200 1015 GET http://www.ietf.org/css/ietf.css fbo DIRECT/64.170.98.32 text/css
TCP_MISS
307
82.238.42.70
GET
text/css
64.170.98.32
DIRECT
1015
200
http://www.ietf.org/css/ietf.css
fbo
web proxy
squid
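The native format is whitespace-separated, with two composite fields (`CODE/REQUESTSTATUS` and `PEERSTATUS/PEERHOST`). An illustrative split of the sample line above, including the `decode_user` rule that drops the `-` placeholder:

```python
line = ('1259844091.407 307 82.238.42.70 TCP_MISS/200 1015 GET '
        'http://www.ietf.org/css/ietf.css fbo DIRECT/64.170.98.32 text/css')
fields = line.split()

epoch, elapsed, ip = fields[0], fields[1], fields[2]
code, requeststatus = fields[3].split('/')      # e.g. TCP_MISS / 200
size, method, url = fields[4], fields[5], fields[6]
user = fields[7] if fields[7] != '-' else None  # decode_user rule
peerstatus, peerhost = fields[8].split('/')
mimetype = fields[9]
```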
pylogsparser-0.4/normalizers/deny_traffic.xml 0000644 0001750 0001750 00000037107 11705765631 017754 0 ustar fbo fbo
clo@wallix.com
[-a-z0-9]+
GET|OPTIONS|HEAD|POST|PUT|DELETE|TRACE|CONNECT
HTTP/[0-9]+[.][0-9]+
\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[+]\d{2}
[^,]+
DENYALL_UID,DATE,LOCAL_IP,APP_ID,HOST_HEADER,REMOTE_IP,REMOTE_PORT,FORWARDED_FOR,VIA,REMOTE_USER,USER_AGENT,HTTPS_FLAGS,SSL_PROTOCOL,DN,CERTIFICATE_START,CERTIFICATE_END,HTTP_METHOD,URL_ADDRESS,URL_OPTIONS,HTTP_PROTOCOL_VERSION,HTTP_RESPONSE_CODE,RESPONSE_TIME,BYTES_SENT,BYTES_RECEIVED,REFERER,XCACHE,GZRATIO
DENYALL_UID
DATE
YYYY-MM-DD hh:mm:ss
LOCAL_IP
APP_ID
HOST_HEADER
REMOTE_IP
REMOTE_PORT
FORWARDED_FOR
VIA
REMOTE_USER
USER_AGENT
HTTPS_FLAGS
SSL_PROTOCOL
DN
CERTIFICATE_START
CERTIFICATE_END
HTTP_METHOD
URL_ADDRESS
URL_OPTIONS
HTTP_PROTOCOL_VERSION
HTTP_RESPONSE_CODE
RESPONSE_TIME
BYTES_SENT
BYTES_RECEIVED
REFERER
XCACHE
GZRATIO
1,2011-01-24 18:07:55+01,192.168.80.10,d74ca776-265b-11e0-a54a-000c298895c5,192.168.80.10,192.168.80.1,57548,,,,Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2.13) Gecko/20101203 AskTbTRL2/3.9.1.14019 Firefox/3.6.13,0,,,,,GET,/,,HTTP/1.1,200,215872,1625,409,,,
1
192.168.80.10
d74ca776-265b-11e0-a54a-000c298895c5
192.168.80.10
192.168.80.1
57548
Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2.13) Gecko/20101203 AskTbTRL2/3.9.1.14019 Firefox/3.6.13
0
GET
/
HTTP/1.1
200
215872
1625
409
web proxy
pylogsparser-0.4/normalizers/netfilter.xml 0000644 0001750 0001750 00000015362 11705765631 017312 0 ustar fbo fbo
Netfilter log normalization.
Netfilter logs consist of a list of keys and values. Normalized keys are "in", "out", "mac", "src", "spt", "dst", "dpt", "len", "proto".
Ce normaliseur analyse les logs émis par le composant kernel Netfilter.
Les messages Netfilter consistent en une liste de clés et de valeurs associées.
Les clés extraites par ce normaliseur sont "in", "out", "mac", "src", "spt", "dst", "dpt", "len", "proto".
fbo@wallix.com
Some typical fields used for log identification.
Quelques champs propres aux logs NETFILTER.
IN=.* OUT=.* SRC=.* DST=.*
ACCEPTED = [ "in", "out", "mac", "src",
"spt", "dst", "dpt", "len", "proto" ]
# Retrieve elements separated by spaces
elms = value.split()
candidates = [elm for elm in elms if not elm.find('=') == -1 and not elm.endswith('=')]
kv_dict = dict([x.split('=') for x in candidates])
for k,v in kv_dict.items():
    kl = k.lower()
    if kl in ACCEPTED:
        log[kl] = v
TRANSLATE = {'in': 'inbound_int',
'out': 'outbound_int',
'src': 'source_ip',
'dst': 'dest_ip',
'proto': 'protocol',
'spt': 'source_port',
'dpt': 'dest_port'}
for k, v in TRANSLATE.items():
    if k in log.keys():
        val = log[k]
        del log[k]
        log[v] = val
if 'mac' in log.keys():
    log['dest_mac'] = log['mac'][:17]
    log['source_mac'] = log['mac'][18:-6]
    del log['mac']
log['program'] = 'netfilter'
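End to end, the callbacks above first keep only the accepted keys (lower-cased), then split the combined MAC field into destination and source addresses. A condensed sketch on a trimmed message:

```python
# Trimmed Netfilter message; 'OUT=' is dropped because its value is empty.
value = ('IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:15:5d:20:c2:06:08:00 '
         'SRC=69.10.39.115 DPT=6112')
ACCEPTED = ["in", "out", "mac", "src", "spt", "dst", "dpt", "len", "proto"]

log = {}
candidates = [e for e in value.split() if '=' in e and not e.endswith('=')]
for k, v in dict(x.split('=') for x in candidates).items():
    if k.lower() in ACCEPTED:
        log[k.lower()] = v

# The MAC field packs destination (17 chars), source, and the ethertype.
if 'mac' in log:
    log['dest_mac'] = log['mac'][:17]
    log['source_mac'] = log['mac'][18:-6]
    del log['mac']
```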
kernel
(?:USERPREFIX )?KEYVALUES
a user defined log prefix
un préfixe défini par l'utilisateur
USERPREFIX
Generic Netfilter message with many key-values couples
Message Netfilter générique comportant plusieurs couples clé-valeur
KEYVALUES
decode_netfilter_key_value
*UDP_IN Blocked* IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:15:5d:20:c2:06:08:00 SRC=69.10.39.115 DST=255.255.255.255 LEN=166 TOS=0x00 PREC=0x00 TTL=128 ID=22557 PROTO=UDP SPT=55439 DPT=6112
netfilter
*UDP_IN Blocked*
eth0
ff:ff:ff:ff:ff:ff
00:15:5d:20:c2:06
69.10.39.115
255.255.255.255
166
UDP
55439
6112
firewall
pylogsparser-0.4/normalizers/symantec.xml 0000644 0001750 0001750 00000134355 11705765631 017145 0 ustar fbo fbo
This normalizer parses messages issued by the Symantec Antivirus.
Ce normaliser analyse les messages émis par Symantec Antivirus.
fbo@wallix.com
^(?:[A-F0-9]{2}){6}$
[1-4]
\d{1,3}
[012]?
date_params = [ int(value[2*i : 2*(i+1)], 16) for i in range(6) ]
date_params[0] += 1970
date_params[1] += 1
log['date'] = datetime(*date_params)
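A concrete run of the decoding above: each of the six byte-sized hex fields holds, in order, years since 1970, zero-based month, day, hour, minute, second. Decoding the sample value `200A13080122`:

```python
from datetime import datetime

value = '200A13080122'
date_params = [int(value[2*i:2*(i+1)], 16) for i in range(6)]
date_params[0] += 1970   # years are counted from 1970
date_params[1] += 1      # months are zero-based
date = datetime(*date_params)
# 0x20=32 -> 2002, 0x0A=10 -> November, 0x13=19, then 08:01:34
```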
# the list below is compliant up to Symantec Endpoint Protection 11.0
events = [ '',
'GL_EVENT_IS_ALERT',
'GL_EVENT_SCAN_STOP',
'GL_EVENT_SCAN_START',
'GL_EVENT_PATTERN_UPDATE',
'GL_EVENT_INFECTION',
'GL_EVENT_FILE_NOT_OPEN',
'GL_EVENT_LOAD_PATTERN',
'GL_STD_MESSAGE_INFO NOT USED',
'GL_STD_MESSAGE_ERROR NOT USED',
'GL_EVENT_CHECKSUM',
'GL_EVENT_TRAP',
'GL_EVENT_CONFIG_CHANGE',
'GL_EVENT_SHUTDOWN',
'GL_EVENT_STARTUP',
'UNDOCUMENTED',
'GL_EVENT_PATTERN_DOWNLOAD',
'GL_EVENT_TOO_MANY_VIRUSES',
'GL_EVENT_FWD_TO_QSERVER',
'GL_EVENT_SCANDLVR',
'GL_EVENT_BACKUP',
'GL_EVENT_SCAN_ABORT',
'GL_EVENT_RTS_LOAD_ERROR',
'GL_EVENT_RTS_LOAD',
'GL_EVENT_RTS_UNLOAD',
'GL_EVENT_REMOVE_CLIENT',
'GL_EVENT_SCAN_DELAYED',
'GL_EVENT_SCAN_RESTART',
'GL_EVENT_ADD_SAVROAMCLIENT_TOSERVER',
'GL_EVENT_REMOVE_SAVROAMCLIENT_FROMSERVER',
'GL_EVENT_LICENSE_WARNING',
'GL_EVENT_LICENSE_ERROR',
'GL_EVENT_LICENSE_GRACE',
'GL_EVENT_UNAUTHORIZED_COMM',
'GL_EVENT_LOG_FWD_THRD_ERR',
'GL_EVENT_LICENSE_INSTALLED',
'GL_EVENT_LICENSE_ALLOCATED',
'GL_EVENT_LICENSE_OK',
'GL_EVENT_LICENSE_DEALLOCATED',
'GL_EVENT_BAD_DEFS_ROLLBACK',
'GL_EVENT_BAD_DEFS_UNPROTECTED',
'GL_EVENT_SAV_PROVIDER_PARSING_ERROR',
'GL_EVENT_RTS_ERROR',
'GL_EVENT_COMPLIANCE_FAIL',
'GL_EVENT_COMPLIANCE_SUCCESS',
'GL_EVENT_SECURITY_SYMPROTECT_POLICYVIOLATION',
'GL_EVENT_ANOMALY_START',
'GL_EVENT_DETECTION_ACTION_TAKEN',
'GL_EVENT_REMEDIATION_ACTION_PENDING',
'GL_EVENT_REMEDIATION_ACTION_FAILED',
'GL_EVENT_REMEDIATION_ACTION_SUCCESSFUL',
'GL_EVENT_ANOMALY_FINISH',
'GL_EVENT_COMMS_LOGIN_FAILED',
'GL_EVENT_COMMS_LOGIN_SUCCESS',
'GL_EVENT_COMMS_UNAUTHORIZED_COMM',
'GL_EVENT_CLIENT_INSTALL_AV',
'GL_EVENT_CLIENT_INSTALL_FW',
'GL_EVENT_CLIENT_UNINSTALL',
'GL_EVENT_CLIENT_UNINSTALL_ROLLBACK',
'GL_EVENT_COMMS_SERVER_GROUP_ROOT_CERT_ISSUE',
'GL_EVENT_COMMS_SERVER_CERT_ISSUE',
'GL_EVENT_COMMS_TRUSTED_ROOT_CHANGE',
'GL_EVENT_COMMS_SERVER_CERT_STARTUP_FAILED',
'GL_EVENT_CLIENT_CHECKIN',
'GL_EVENT_CLIENT_NO_CHECKIN',
'GL_EVENT_SCAN_SUSPENDED',
'GL_EVENT_SCAN_RESUMED',
'GL_EVENT_SCAN_DURATION_INSUFFICIENT',
'GL_EVENT_CLIENT_MOVE',
'GL_EVENT_SCAN_FAILED_ENHANCED',
'GL_EVENT_MAX_EVENT_NUMBER',
'GL_EVENT_HEUR_THREAT_NOW_WHITELISTED',
'GL_EVENT_INTERESTING_PROCESS_DETECTED_START',
'GL_EVENT_LOAD_ERROR_COH',
'GL_EVENT_LOAD_ERROR_SYKNAPPS',
'GL_EVENT_INTERESTING_PROCESS_DETECTED_FINISH',
'GL_EVENT_HPP_SCAN_NOT_SUPPORTED_FOR_OS',
'GL_EVENT_HEUR_THREAT_NOW_KNOWN',
'UNDOCUMENTED',
'UNDOCUMENTED',
'GL_EVENT_MAX_EVENT_NUMBER'] # not entirely sure about the last one; the KB article seems pasted out of code
log['event_id'] = events[int(value)]
categories = ['','Infection', 'Summary', 'Pattern', 'Security']
log['category'] = categories[int(value)]
loggers = { '0' : 'Scheduled',
'1' : 'Manual',
'2' : 'Real Time',
'6' : 'Console',
'7' : 'VPDOWN',
'8' : 'System',
'9' : 'Startup',
'101' : 'Client',
'102' : 'Forwarded',
'65637' : 'Manual Scan',
'131173' : 'Real Time',
'524389' : 'System',
'720997' : 'Defwatch',
'6619237' : 'Client',
}
log['event_logger_type'] = loggers.get(value, value)
actions = ['',
'Quarantine infected file',
'Rename infected file',
'Delete infected file',
'Leave alone (log only)',
'Clean virus from file',
'Clean or delete macros']
try:
    trans = actions[int(value)]
except:
    trans = "Unknown action"
log['primary_action_configuration'] = trans
actions = ['',
'Quarantine infected file',
'Rename infected file',
'Delete infected file',
'Leave alone (log only)',
'Clean virus from file',
'Clean or delete macros']
try:
    trans = actions[int(value)]
except:
    trans = "Unknown action"
log['secondary_action_configuration'] = trans
actions = ['',
'Quarantined',
'Renamed',
'Deleted',
'Left alone',
'Cleaned',
'Cleaned or macros deleted',
'Saved file as...',
'Sent to Intel (AMS)',
'Moved to backup location',
'Renamed backup file',
'Undo action in Quarantine View',
'Write protected or lack of permissions - Unable to act on file',
'Backed up file',
'Pending analysis',
'First action was partially successful; second action was Leave Alone. Results of the second action are not mentioned.',
'A process needs to be terminated to remove a risk',
'Prevent a risk from being logged or a user interface from being displayed',
'Performing a request to restart the computer',
'Shows as Cleaned by Deletion in the Risk History in the UI and the Logs in the SSC',
'Auto-Protect prevented a file from being created; reported "Access denied."']
log['action'] = actions[int(value)]
virus_index = hex(int(value))[2:]
expanded_threat_index = 0
if len(virus_index) >= 2:
    expanded_threat_index = int(virus_index[-2], 16)
virus_types = { '1' : 'VEBOOTVIRUS',
'3' : 'VEBOOT1VIRUS',
'5' : 'VEBOOT2VIRUS',
'9' : 'VEBOOT3VIRUS',
'100' : 'VEFILEVIRUS',
'300' : 'VEMUTATIONVIRUS',
'500' : 'VEFILEMACROVIRUS',
'900' : 'VEFILE2VIRUS',
'1100' : 'VEFIL3VIRUS',
'10000' : 'VEMEMORYVIRUS',
'30000' : 'VEMEMOSVIRUS',
'50000' : 'VEMEMMCBVIRUS',
'90000' : 'VEMEMHIGHESTVIRUS',
'1000000' : 'VEVIRUSBEHAVIOR',
'3000000' : 'VEVIRUS1BEHAVIOR',
'8000000' : 'VEFILECOMPRESSED',
'10000000' : 'VEHURISTIC',
}
expanded_threats = ['',
' + VE_NON_VIRAL_MALICIOUS',
' + VE_RESERVED_MALICIOUS',
' + VE_HEURISTIC',
' + VE_SECURITY_RISK_ON',
' + VE_HACKER_TOOLS',
' + VE_SPYWARE',
' + VE_TRACKWARE',
' + VE_DIALERS',
' + VE_REMOTE_ACCESS',
' + VE_ADWARE',
' + VE_JOKE_PROGRAMS',
' + VE_SECURITY_RISK_OFF',
' + UNDOCUMENTED',
' + UNDOCUMENTED',
' + UNDOCUMENTED',]
res = virus_types.get(virus_index, "UNDOCUMENTED") +\
expanded_threats[expanded_threat_index]
log['virus_type'] = res
flags = {
'4194304': 'EB_ACCESS_DENIED',
'268435456': 'EB_NO_LOG',
'536870912': 'EB_FROM_CLIENT',
'134217728': 'EB_LAST_ITEM',
'16777216': 'EB_LOG',
'33554432': 'EB_REAL_CLIENT',
'4095': 'EB_FA_OVERLAYS',
'4190208': 'EB_N_OVERLAYS',
'67108864': 'EB_FIRST_ITEM',
'8388608': 'EB_REPORT'}
log['eventblock_action'] = flags.get(value, "UNDOCUMENTED")
status = {'0' : 'QF_NONE',
'1' : 'QF_FAILED',
'2' : 'QF_OK'}
log['quarantine_attempt_status'] = status.get(value)
def bits(x):
    # helper function to decode the bitmask
    if x == 0:
        return ()
    else:
        top_pow = int(math.log(x, 2))
        return (top_pow,) + bits(x - 2**top_pow)
flags = ['FA_READ',
'FA_WRITE',
'FA_EXEC',
'FA_IN_TABLE',
'FA_REJECT_ACTION',
'FA_ACTION_COMPLETE',
'FA_DELETE_WHEN_COMPLETE',
'FA_CLIENT_REQUEST',
'FA_OWNED_BY_USER',
'FA_DELETE',
'FA_OWNED_BY_QUEUE',
'FA_FILE_IN_CACHE',
'FA_SCAN',
'FA_GET_TRAP_DATA',
'FA_USE_TRAP_DATA',
'FA_FILE_NEEDS_SCAN',
'FA_BEFORE_OPEN',
'FA_AFTER_OPEN',
'FA_SCAN_BOOT_SECTOR',
'FA_COMING_FROM_NAVAP',
'FA_BACKUP_TO_QUARANTINE']
if value != '0':
    ret = ' + '.join([flags[i] for i in bits(int(value))])
else:
    ret = value
log['operation_flags'] = ret
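The recursive `bits` helper peels off the highest set bit each time, so the flag names come out from most- to least-significant. A self-contained sketch with a shortened flag table:

```python
import math

def bits(x):
    # return the positions of the set bits, highest first
    if x == 0:
        return ()
    top_pow = int(math.log(x, 2))
    return (top_pow,) + bits(x - 2 ** top_pow)

flags = ['FA_READ', 'FA_WRITE', 'FA_EXEC', 'FA_IN_TABLE']
value = '5'  # binary 101: bits 2 and 0 are set
ret = ' + '.join(flags[i] for i in bits(int(value)))
# -> 'FA_EXEC + FA_READ'
```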
if value == '0':
    ret = "No"
elif value == '1':
    ret = "Yes"
else:
    ret = "UNDOCUMENTED"
log['compressed_file'] = ret
status = {'0' : 'VECLEANABLE',
'1' : 'VENOCLEANPATTERN',
'2' : 'VENOTCLEANABLE'}
log['cleanable'] = status.get(value, "UNDOCUMENTED")
status = {'0' : 'VEDELETABLE',
'1' : 'VENOTDELETABLE'}
log['deletable'] = status.get(value, "UNDOCUMENTED")
def bits(x):
    # helper function to decode the bitmask
    if x == 0:
        return ()
    else:
        top_pow = int(math.log(x, 2))
        return (top_pow,) + bits(x - 2**top_pow)
meanings = {
0: 'The file could not be opened',
1: 'The file was wiped clean of data',
2: 'The file was truncated to 0 bytes',
3: 'The file could not be deleted',
8: 'Flag created files due to special handling',
9: 'The just created infected file was deleted',
10: 'Dir2-type infected files are not quarantined',
11: 'Dir2-type infected files are deleted if the file is being created',
12: 'Dir2-type infected files are not deleted',
16: 'File was deleted due to the DESTROY flag',
}
if value == '0':
    ret = "No information"
else:
    ret = " + ".join([meanings[i] for i in bits(int(value))])
log['action1_status'] = ret
def bits(x):
    # helper function to decode the bitmask
    if x == 0:
        return ()
    else:
        top_pow = int(math.log(x, 2))
        return (top_pow,) + bits(x - 2**top_pow)
meanings = {
0: 'The file could not be opened',
1: 'The file was wiped clean of data',
2: 'The file was truncated to 0 bytes',
3: 'The file could not be deleted',
8: 'Flag created files due to special handling',
9: 'The just created infected file was deleted',
10: 'Dir2-type infected files are not quarantined',
11: 'Dir2-type infected files are deleted if the file is being created',
12: 'Dir2-type infected files are not deleted',
16: 'File was deleted due to the DESTROY flag',
}
if value == '0':
    ret = "No information"
else:
    ret = " + ".join([meanings[i] for i in bits(int(value))])
log['action2_status'] = ret
Pattern definition for Symantec Antivirus version 8
Définition de pattern pour Symantec Antivirus version 8
DATE,EVENT_NUMBER,CATEGORY,EVENT_LOGGER_TYPE,COMPUTER,USERNAME,VIRUS_NAME,VIRUS_LOCATION,PRIMARY_ACTION_CONFIGURATION,SECONDARY_ACTION_CONFIGURATION,ACTION_TAKEN,VIRUS_TYPE,EVENTBLOCK_ACTION,BODY,SCAN_ID,UNKNOWN1,GROUP_ID,EVENT_DATA,QUARANTINED_FILE_ID,VIRUS_ID,QUARANTINE_ATTEMPT_STATUS,OPERATION_FLAGS,UNKNOWN2,COMPRESSED_FILE,VIRUS_DEPTH_IN_COMPRESSED_FILE,AMOUNT_OF_REMAINING_INFECTED_FILES,VIRUS_DEFINITIONS_VERSION,VIRUS_DEFINITION_SEQUENCE_NUMBER,CLEANABLE,DELETABLE,BACKUP_ID,PARENT,GUID,CLIENT_GROUP,ADDRESS,SERVER_GROUP,DOMAIN_NAME,MAC_ADDRESS,VERSION
DATE
decode_date
Category
Catégorie
EVENT_NUMBER
event_translator
CATEGORY
category_translator
EVENT_LOGGER_TYPE
logger_translator
COMPUTER
User
Utilisateur
USERNAME
Virus name
Nom du virus
VIRUS_NAME
Virus location
Emplacement du virus
VIRUS_LOCATION
PRIMARY_ACTION_CONFIGURATION
action1_translator
SECONDARY_ACTION_CONFIGURATION
action2_translator
Action taken
Action effectuée
ACTION_TAKEN
action0_translator
Virus type
Type du virus
VIRUS_TYPE
virustype_translator
EVENTBLOCK_ACTION
flag_translator
Message body describing the event
Corps du message, décrivant l'événement
BODY
Scan identifier
Identifiant du scan
SCAN_ID
UNKNOWN1
GROUP_ID
EVENT_DATA
QUARANTINED_FILE_ID
Virus identifier
Identifiant du virus
VIRUS_ID
Quarantine attempt status
Statut de la tentative de mise en quarantaine
QUARANTINE_ATTEMPT_STATUS
quarantinest_translator
OPERATION_FLAGS
access_translator
UNKNOWN2
COMPRESSED_FILE
compressed_translator
VIRUS_DEPTH_IN_COMPRESSED_FILE
Amount of remaining infected files
Nombre de fichiers encore infectés
AMOUNT_OF_REMAINING_INFECTED_FILES
Virus definition file version
Version du fichier de définitions des virus
VIRUS_DEFINITIONS_VERSION
VIRUS_DEFINITION_SEQUENCE_NUMBER
CLEANABLE
clean_translator
DELETABLE
delete_translator
BACKUP_ID
PARENT
GUID
CLIENT_GROUP
ADDRESS
SERVER_GROUP
DOMAIN_NAME
MAC Address
Adresse MAC
MAC_ADDRESS
Version
Version
VERSION
symantec
200A13080122,23,2,8,TRAVEL00,SYSTEM,,,,,,,16777216,"Symantec AntiVirus Realtime Protection Loaded.",0,,0,,,,,0,,,,,,,,,,SAMPLE_COMPUTER,,,,Parent,GROUP,,8.0.93330
symantec
2002-11-19 08:01:34
Summary
TRAVEL00
GROUP
System
GL_EVENT_RTS_LOAD
EB_LOG
0
0
SAMPLE_COMPUTER
0
Parent
SYSTEM
8.0.93330
antivirus
Pattern definition for Symantec Antivirus version 9
Définition de pattern pour Symantec Antivirus version 9
DATE,EVENT_NUMBER,CATEGORY,EVENT_LOGGER_TYPE,COMPUTER,USERNAME,VIRUS_NAME,VIRUS_LOCATION,PRIMARY_ACTION_CONFIGURATION,SECONDARY_ACTION_CONFIGURATION,ACTION_TAKEN,VIRUS_TYPE,EVENTBLOCK_ACTION,BODY,SCAN_ID,UNKNOWN1,GROUP_ID,EVENT_DATA,QUARANTINED_FILE_ID,VIRUS_ID,QUARANTINE_ATTEMPT_STATUS,OPERATION_FLAGS,UNKNOWN2,COMPRESSED_FILE,VIRUS_DEPTH_IN_COMPRESSED_FILE,AMOUNT_OF_REMAINING_INFECTED_FILES,VIRUS_DEFINITIONS_VERSION,VIRUS_DEFINITION_SEQUENCE_NUMBER,CLEANABLE,DELETABLE,BACKUP_ID,PARENT,GUID,CLIENT_GROUP,ADDRESS,SERVER_GROUP,DOMAIN_NAME,MAC_ADDRESS,VERSION,REMOTE_MACHINE,REMOTE_MACHINE_IP,ACTION1_STATUS,ACTION2_STATUS,LICENSE_FEATURE_NAME,LICENSE_FEATURE_VER,LICENSE_SERIAL_NUM,LICENSE_FULFILLMENT_ID,LICENSE_START_DT,LICENSE_EXPIRATION_DT,LICENSE_LIFECYCLE,LICENSE_SEATS_TOTAL,LICENSE_SEATS,ERR_CODE,LICENSE_SEATS_DELTA,STATUS,DOMAIN_GUID,LOG_SESSION_GUID,VBIN_SESSION_GUID,LOGIN_DOMAIN
DATE
decode_date
Category
Catégorie
EVENT_NUMBER
event_translator
CATEGORY
category_translator
EVENT_LOGGER_TYPE
logger_translator
COMPUTER
User name
Utilisateur
USERNAME
Virus name
Nom du virus
VIRUS_NAME
Virus location
Emplacement du virus
VIRUS_LOCATION
PRIMARY_ACTION_CONFIGURATION
action1_translator
SECONDARY_ACTION_CONFIGURATION
action2_translator
Action taken
Action effectuée
ACTION_TAKEN
action0_translator
Virus type
Type du virus
VIRUS_TYPE
virustype_translator
EVENTBLOCK_ACTION
flag_translator
Message body describing the event
Corps du message, décrivant l'événement
BODY
Scan identifier
Identifiant du scan
SCAN_ID
UNKNOWN1
GROUP_ID
EVENT_DATA
QUARANTINED_FILE_ID
Virus identifier
Identifiant du virus
VIRUS_ID
Quarantine attempt status
Statut de la tentative de mise en quarantaine
QUARANTINE_ATTEMPT_STATUS
quarantinest_translator
OPERATION_FLAGS
access_translator
UNKNOWN2
COMPRESSED_FILE
compressed_translator
VIRUS_DEPTH_IN_COMPRESSED_FILE
Number of remaining infected files
Nombre de fichiers encore infectés
AMOUNT_OF_REMAINING_INFECTED_FILES
Virus definition file version
Version du fichier de définitions des virus
VIRUS_DEFINITIONS_VERSION
VIRUS_DEFINITION_SEQUENCE_NUMBER
CLEANABLE
clean_translator
DELETABLE
delete_translator
BACKUP_ID
PARENT
GUID
CLIENT_GROUP
ADDRESS
SERVER_GROUP
DOMAIN_NAME
MAC Address
Adresse MAC
MAC_ADDRESS
Version
Version
VERSION
REMOTE_MACHINE
REMOTE_MACHINE_IP
ACTION1_STATUS
action1s9_translator
ACTION2_STATUS
action2s9_translator
LICENSE_FEATURE_NAME
LICENSE_FEATURE_VER
LICENSE_SERIAL_NUM
LICENSE_FULFILLMENT_ID
LICENSE_START_DT
LICENSE_EXPIRATION_DT
LICENSE_LIFECYCLE
LICENSE_SEATS_TOTAL
LICENSE_SEATS
ERR_CODE
LICENSE_SEATS_DELTA
STATUS
DOMAIN_GUID
LOG_SESSION_GUID
VBIN_SESSION_GUID
LOGIN_DOMAIN
symantec
200A13080122,23,2,8,TRAVEL00,SYSTEM,,,,,,,16777216,"Symantec AntiVirus Realtime Protection Loaded.",0,,0,,,,,0,,,,,,,,,,SAMPLE_COMPUTER,,,,Parent,GROUP,,9.0.93330,,,,,,,,,,,,,,,,,,,,
symantec
2002-11-19 08:01:34
Summary
TRAVEL00
GROUP
System
GL_EVENT_RTS_LOAD
EB_LOG
0
0
SAMPLE_COMPUTER
0
Parent
SYSTEM
9.0.93330
antivirus
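The Symantec v9 pattern above maps a comma-separated line onto the ordered field names given in the taglist. A minimal standalone sketch of that mapping, assuming an abridged FIELDS list and a hypothetical `parse_symantec` helper (neither is part of the normalizer itself):

```python
import csv
from io import StringIO

# Abridged field ordering; the full list is given in the pattern's taglist.
FIELDS = ["DATE", "EVENT_NUMBER", "CATEGORY", "EVENT_LOGGER_TYPE",
          "COMPUTER", "USERNAME"]

def parse_symantec(line):
    # csv handles the quoted BODY field, which may itself contain commas
    values = next(csv.reader(StringIO(line)))
    return dict(zip(FIELDS, values))

row = parse_symantec('200A13080122,23,2,8,TRAVEL00,SYSTEM')
# row["COMPUTER"] is "TRAVEL00"; row["USERNAME"] is "SYSTEM"
```

Empty trailing columns in real lines simply produce empty-string values for the corresponding field names.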
pylogsparser-0.4/normalizers/RefererParser.xml
This normalizer extracts additional info from URLs such as domain, protocol, etc.
Ce normaliseur extrait des données supplémentaires des URLs telles que le domaine, le protocole, etc.
mhu@wallix.com
parsed = urlparse.urlparse(value)
if parsed.hostname:
log['referer_hostname'] = parsed.hostname
# naive approach
if len(parsed.hostname.split('.')) < 2:
domain = None
else:
domain = '.'.join(parsed.hostname.split('.')[1:])
log['referer_domain'] = domain or parsed.hostname
if parsed.path:
log['referer_path'] = parsed.path
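The callback above relies on Python 2's urlparse module. A self-contained Python 3 sketch of the same logic, using urllib.parse and a hypothetical `parse_referer` wrapper (the real callback receives `log` and `value` from its execution context):

```python
from urllib.parse import urlparse

def parse_referer(value):
    log = {}
    parsed = urlparse(value)
    if parsed.hostname:
        log['referer_hostname'] = parsed.hostname
        labels = parsed.hostname.split('.')
        # naive approach: drop the leftmost label to obtain the domain
        domain = '.'.join(labels[1:]) if len(labels) >= 2 else None
        log['referer_domain'] = domain or parsed.hostname
    if parsed.path:
        log['referer_path'] = parsed.path
    return log

log = parse_referer("http://www.wallix.org/2011/09/20/how-to-use-linux-containers-lxc-under-debian-squeeze/")
# log['referer_domain'] is "wallix.org"
```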
URL
URL
decodeURL
http://www.wallix.org/2011/09/20/how-to-use-linux-containers-lxc-under-debian-squeeze/
www.wallix.org
/2011/09/20/how-to-use-linux-containers-lxc-under-debian-squeeze/
wallix.org
pylogsparser-0.4/normalizers/syslog.xml
This normalizer is used to parse syslog lines, as defined in RFC3164.
Priority, when present, is broken into the facility and severity codes.
Ce normaliseur traite les événements au format syslog, tel qu'il est défini dans la RFC3164.
Si le message contient une information de priorité, celle-ci est décomposée en deux valeurs : facilité et gravité.
mhu@wallix.com
Expression matching a syslog line priority, defined as 8*facility + severity.
Expression correspondant à la priorité du message, suivant la formule 8 x facilité + gravité.
\d{1,3}
Expression matching the log's source.
Expression correspondant à la source du message.
[^: ]+
Expression matching the log's program.
Expression correspondant au programme notifiant l'événement.
[^: []*
# define facilities
FACILITIES = { 0: "kernel",
1: "user",
2: "mail",
3: "daemon",
4: "auth",
5: "syslog",
6: "print",
7: "news",
8: "uucp",
9: "ntp",
10: "secure",
11: "ftp",
12: "ntp",
13: "audit",
14: "alert",
15: "ntp" }
for i in range(0, 8):
FACILITIES[i+16] = "local%d" % i
# define severities
SEVERITIES = { 0: "emerg",
1: "alert",
2: "crit",
3: "error",
4: "warn",
5: "notice",
6: "info",
7: "debug" }
facility = int(value) // 8  # integer division: '/' would yield a float on Python 3
severity = int(value) % 8
if facility not in FACILITIES or severity not in SEVERITIES:
raise ValueError('facility or severity is out of range')
log["facility"] = "%s" % FACILITIES[facility]
log["severity"] = "%s" % SEVERITIES[severity]
log["facility_code"] = "%d" % facility
log["severity_code"] = "%d" % severity
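The PRI decomposition above can be exercised on its own. A minimal sketch using an abridged facility table (the normalizer's full tables appear above; `decode_priority` here is a plain function, whereas the real callback writes into the shared `log` dictionary):

```python
FACILITIES = {0: "kernel", 1: "user", 2: "mail", 3: "daemon",
              4: "auth", 5: "syslog", 6: "print", 7: "news"}
SEVERITIES = {0: "emerg", 1: "alert", 2: "crit", 3: "error",
              4: "warn", 5: "notice", 6: "info", 7: "debug"}

def decode_priority(value):
    pri = int(value)
    facility, severity = pri // 8, pri % 8  # PRI = 8 * facility + severity
    if facility not in FACILITIES or severity not in SEVERITIES:
        raise ValueError('facility or severity is out of range')
    return FACILITIES[facility], SEVERITIES[severity]

# the sample log's <29> decodes as 29 = 3 * 8 + 5: ("daemon", "notice")
```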
A syslog line with optional priority (sent through network), source, program and optional PID.
Une ligne de log encapsulée par syslog comprenant une priorité (optionnelle), une source, un programme et un PID (optionnel).
(?:<PRIORITY>)?DATE SOURCE PROGRAM(?:\[PID\])?: BODY
the log's priority
la priorité du log, égale à 8 x facilité + gravité
PRIORITY
decode_priority
the log's date
l'horodatage du log par le démon syslog
DATE
MMM dd hh:mm:ss
the log's source
l'équipement d'origine de l'événement
SOURCE
the log's program
le programme à l'origine de l'événement
PROGRAM
the program's process ID
le PID du programme
PID
the actual event message
le message décrivant l'événement
BODY
<29>Jul 18 08:55:35 naruto dhclient[2218]: bound to 10.10.4.11 -- renewal in 2792 seconds.
daemon
notice
naruto
dhclient
2218
bound to 10.10.4.11 -- renewal in 2792 seconds.
A syslog line with optional priority (sent through network), source, and no information about program and PID.
Une ligne de log encapsulée par syslog comprenant une priorité (optionnelle), une source, et pas d'information sur le programme.
(?:<PRIORITY>)?DATE SOURCE BODY
the log's priority
la priorité du log, égale à 8 x facilité + gravité
PRIORITY
decode_priority
the log's date
l'horodatage du log par le démon syslog
DATE
MMM dd hh:mm:ss
the log's source
l'équipement d'origine de l'événement
SOURCE
the actual event message
le message décrivant l'événement
BODY
<29>Jul 18 08:55:35 naruto bound to 10.10.4.11 -- renewal in 2792 seconds.
daemon
notice
naruto
bound to 10.10.4.11 -- renewal in 2792 seconds.
pylogsparser-0.4/normalizers/dhcpd.xml
This normalizer is used to parse DHCPd messages.
Ce normaliseur analyse les messages émis par les serveurs DHCPd.
mhu@wallix.com
Expression matching a single word or lexeme.
Expression correspondant à un mot sans espace interstitiel.
[^ ]+
Expression matching the action notified by the DHCP daemon.
Expression correspondant à l'action DHCP.
DHCP[A-Z]+
log["action"] = value[4:]
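As a standalone sketch of the decode_action callback above: the matched DHCP message keyword is stored with its "DHCP" prefix stripped (the real callback assigns into the shared `log` dictionary rather than returning a value):

```python
def decode_action(value):
    # "DHCPDISCOVER" -> "DISCOVER", "DHCPINFORM" -> "INFORM", etc.
    return value[4:]
```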
Generic DHCP discovery message.
Structure générique d'un message de découverte DHCP.
DHCPACTION from MACADDRESS via ADDRESS
DHCPACTION
decode_action
MACADDRESS
ADDRESS
DHCPDISCOVER from 02:1c:25:a3:32:76 via 183.213.184.122
DISCOVER
02:1c:25:a3:32:76
183.213.184.122
address assignation
Generic DHCP inform message.
Message générique informatif.
DHCPACTION from IP
DHCPACTION
decode_action
IP
DHCPINFORM from 183.231.184.122
INFORM
183.231.184.122
address assignation
Other DHCP messages: offer, request, acknowledge, non-acknowledge, decline, release.
Autres messages DHCP : offre de bail, requête, confirmation, réfutation, refus, libération de bail.
DHCPACTION [a-z]+ IP [a-z]+ MACADDRESS via VIA
DHCPACTION
decode_action
IP
MACADDRESS
VIA
DHCPOFFER on 183.231.184.122 to 00:13:ec:1c:06:5b via 183.213.184.122
OFFER
183.231.184.122
00:13:ec:1c:06:5b
183.213.184.122
address assignation
DHCPREQUEST for 183.231.184.122 from 00:13:ec:1c:06:5b via 183.213.184.122
REQUEST
183.231.184.122
00:13:ec:1c:06:5b
183.213.184.122
address assignation
pylogsparser-0.4/normalizers/bitdefender.xml
This normalizer parses BitDefender (Mail servers UNIX) logs.
Ce normaliseur analyse les logs de BitDefender (version Mail servers UNIX).
fbo@wallix.com
.*
action, action_info = value.split()
log['action'] = action
log['action_info'] = action_info
log['stamp'] = value
r1 = re.compile('.*, hit signature: (?P<sign>.*), .*')
m1 = r1.match(value)
if m1:
log['reason_detail'] = m1.groupdict()['sign']
log['reason'] = 'signature'
return
r2 = re.compile('.*, blacklisted, .*')
m2 = r2.match(value)
if m2:
log['reason'] = 'blacklisted'
return
r3 = re.compile('.*, URI DNSBL: \[(?P<reporter>.*)\], .*')
m3 = r3.match(value)
if m3:
log['reason_detail'] = m3.groupdict()['reporter']
log['reason'] = 'URI DNSBL'
return
r4 = re.compile('.*, spam url, .*')
m4 = r4.match(value)
if m4:
log['reason'] = 'spam url'
return
r5 = re.compile('.*, SQMD Hits: (?P<hits>.*) , .*')
m5 = r5.match(value)
if m5:
log['reason_detail'] = m5.groupdict()['hits']
log['reason'] = 'SQMD Hits'
return
log['body'] = log['raw'].split(': ', 1)[1]
Logs contained in spam.log file.
Logs contenus dans le fichier spam.log.
DATE BDMAILD SPAM: sender: SENDER, recipients: RECIPIENTS, sender IP: SADDR, subject: "SUBJECT", score: SCORE, stamp: "STAMP", agent: AGENT, action: ACTION, header recipients: HRECIPS, headers: HEADERS, group: "GROUP"
The time at which the spam was detected.
La date à laquelle le spam a été détecté.
DATE
MM/dd/YYYY hh:mm:ss
extract_body
The mail sender.
L'expéditeur de mail.
SENDER
The mail recipients list.
La liste des mails destinataires.
RECIPIENTS
Client IP address.
L'adresse IP du client.
SADDR
The mail subject.
Le sujet du mail.
SUBJECT
SCORE
Spam identification information.
Informations d'identification du spam.
STAMP
extract_spam_reason
AGENT
Action taken by BitDefender.
Action prise par BitDefender.
ACTION
decode_action
HRECIPS
HEADERS
GROUP
bitdefender
12/08/2010 11:18:42 BDMAILD SPAM: sender: bounces+333785.61449158.669496@icpbounce.com, recipients: jack@corp.com, sender IP: 127.0.0.1, subject: "=?iso-8859-1?Q?N=B07_sur_7_de_votre_s=E9rie_sur_le_management_du_changeme?= =?iso-8859-1?Q?nt?=", score: 1000, stamp: " v1, build 2.8.60.118893, rbl score: 0(0), hit signature: AUTO_B_IPX_20100613_110223_1_555, total: 1000(775)", agent: Smtp Proxy 3.1.3, action: drop (move-to-quarantine;drop), header recipients: ( "jack@corp.com" ), headers: ( "Received: from localhost [127.0.0.1] by BitDefender SMTP Proxy on localhost [127.0.0.1] for localhost [127.0.0.1]; Wed, 8 Dec 2010 11:18:42 +0100 (CET)" "Received: from paris.office.corp.com (unknown [10.10.1.254]) by as-bd-64.ifr.lan (Postfix) with ESMTP id 305B28A001 for <jack@corp.com>; Wed, 8 Dec 2010 11:18:42 +0100 (CET)" "Received: from smtp16.icpbounce.com (smtp16.icpbounce.com [216.27.93.110]) by paris.office.corp.com (Postfix) with ESMTP id 746D86A423B for <jack@corp.com>; Wed, 8 Dec 2010 11:17:48 +0100 (CET)" "Received: from drone21.rtp.icpbounce.com (agent004.colo.icontact.com [172.27.2.15]) by smtp16.icpbounce.com (Postfix) with ESMTP id 4C5653C7327 for <jack@corp.com>; Wed, 8 Dec 2010 05:15:46 -0500 (EST)" "Received: from localhost.localdomain (unknown [127.0.0.1]) by drone21.rtp.icpbounce.com (Postfix) with ESMTP id 8ED7022BD6 for <jack@corp.com>; Wed, 8 Dec 2010 05:10:39 -0500 (EST)" ), group: "Default"
2010-12-08 11:18:42
1000
Smtp Proxy 3.1.3
bounces+333785.61449158.669496@icpbounce.com
jack@corp.com
=?iso-8859-1?Q?N=B07_sur_7_de_votre_s=E9rie_sur_le_management_du_changeme?= =?iso-8859-1?Q?nt?=
v1, build 2.8.60.118893, rbl score: 0(0), hit signature: AUTO_B_IPX_20100613_110223_1_555, total: 1000(775)
drop
(move-to-quarantine;drop)
Default
AUTO_B_IPX_20100613_110223_1_555
signature
antivirus
10/20/2011 10:01:19 BDMAILD SPAM: sender: debimelva@albaad.com, recipients: djoume@corp.com;lchapuis@cpr.com;matallah@corp.com;mhoulbert@corp.com;rca@corp.com;sales@corp.com;sset@corp.com;steph@corp.com;vbe@corp.com, sender IP: 127.0.0.1, subject: "Replica watches - THE MOST POPULAR MODELS All our replica watches have the same look and feel of the original product", score: 1000, stamp: " v1, build 2.10.1.12405, rbl score: 0(0), hit signature: S_REPL_IPX_080830_02, total: 1000(750)", agent: Smtp Proxy 3.1.3, action: drop (move-to-quarantine;drop), header recipients: ( "<sset@corp.com>" ), headers: ( "Received: from localhost [127.0.0.1] by BitDefender SMTP Proxy on localhost [127.0.0.1] for localhost [127.0.0.1]; Thu, 20 Oct 2011 10:01:19 +0200 (CEST)" "Received: from paris.office.corp.com (go.corp.lan [10.10.1.254]) by as-bd-64.ifr.lan (Postfix) with ESMTP id 5AB6E1C7; Thu, 20 Oct 2011 10:01:19 +0200 (CEST)" "Received: from wfxamsklgv25z.py5nq1lz4i.com (unknown [190.234.5.86]) by paris.office.corp.com (Postfix) with SMTP id 006366A4895; Thu, 20 Oct 2011 09:54:40 +0200 (CEST)" ), group: "Default"
2011-10-20 10:01:19
1000
debimelva@albaad.com
djoume@corp.com;lchapuis@cpr.com;matallah@corp.com;mhoulbert@corp.com;rca@corp.com;sales@corp.com;sset@corp.com;steph@corp.com;vbe@corp.com
drop
Default
v1, build 2.10.1.12405, rbl score: 0(0), hit signature: S_REPL_IPX_080830_02, total: 1000(750)
S_REPL_IPX_080830_02
signature
antivirus
10/20/2011 16:07:40 BDMAILD SPAM: sender: 2363840z15263@bounce.crugeman.net, recipients: presse@corp.com, sender IP: 127.0.0.1, subject: "Conventions collectives nationales", score: 1000, stamp: " v1, build 2.10.1.12405, SQMD Hits: Spam FuzzyHit CRT_BGU , rbl score: 0(0), apm score: 500, SQMD: 6e74b86f401125abf381712e9dcc808e.fuzzy.fzrbl.org, total: 1000(750)", agent: Smtp Proxy 3.1.3, action: drop (move-to-quarantine;drop), header recipients: ( "<presse@corp.com>" ), headers: ( "Received: from localhost [127.0.0.1] by BitDefender SMTP Proxy on localhost [127.0.0.1] for localhost [127.0.0.1]; Thu, 20 Oct 2011 16:07:39 +0200 (CEST)" "Received: from paris.office.corp.com (go.corp.lan [10.10.1.254]) by as-bd-64.ifr.lan (Postfix) with ESMTP id BE4641C7 for <presse@corp.com>; Thu, 20 Oct 2011 16:07:39 +0200 (CEST)" "Received: from mx01.crugeman.net (mx01.crugeman.net [195.43.150.178]) by paris.office.corp.com (Postfix) with ESMTP id DF33E6A42A4 for <presse@corp.com>; Thu, 20 Oct 2011 16:01:10 +0200 (CEST)" "Received: by mx01.crugeman.net (Postfix, from userid 0) id C57BE89416; Thu, 20 Oct 2011 16:01:09 +0200 (CEST)" ), group: "Default"
2011-10-20 16:07:40
1000
presse@corp.com
drop
Default
v1, build 2.10.1.12405, SQMD Hits: Spam FuzzyHit CRT_BGU , rbl score: 0(0), apm score: 500, SQMD: 6e74b86f401125abf381712e9dcc808e.fuzzy.fzrbl.org, total: 1000(750)
Spam FuzzyHit CRT_BGU
SQMD Hits
bitdefender
antivirus
Logs contained in update.log file.
Logs contenus dans le fichier update.log.
DATE BDLIVED INFO: .*
The time at which the event was detected.
La date à laquelle l'événement a été détecté.
DATE
MM/dd/YYYY hh:mm:ss
extract_body
bitdefender
10/24/2011 15:33:30 BDLIVED INFO: Downloading files for location 'antispam_sig_nx' from 'upgrade.bitdefender.com'
2011-10-24 15:33:30
bitdefender
antivirus
Logs contained in mail.log file.
Logs contenus dans le fichier mail.log.
DATE BDMAILD INFO: .*
The time at which the event was detected.
La date à laquelle l'événement a été détecté.
DATE
MM/dd/YYYY hh:mm:ss
extract_body
bitdefender
10/24/2011 13:33:11 BDMAILD INFO: cannot use an empty footer
2011-10-24 13:33:11
bitdefender
antivirus
Logs contained in error.log file.
Logs contenus dans le fichier error.log.
DATE BDSCAND ERROR: .*
The time at which the event was detected.
La date à laquelle l'événement a été détecté.
DATE
MM/dd/YYYY hh:mm:ss
extract_body
bitdefender
10/24/2011 04:31:39 BDSCAND ERROR: failed to initialize the AV core
2011-10-24 04:31:39
bitdefender
antivirus
pylogsparser-0.4/normalizers/snare.xml
This normalizer handles event logs sent by Snare agent for Windows.
Ce normaliseur analyse les logs envoyés par Snare agent for Windows.
clo@wallix.com
String containing Windows' authorized characters for computers, users, etc...
Chaîne contenant les caractères autorisés de Windows pour les noms d'ordinateur, utilisateurs etc..
[^\t]+|(?:N/A)
'MSWinEventLog' for Snare for Windows.
'MSWinEventLog' pour Snare for Windows.
MSWinEventLog
Criticality tag is a number between 0 and 4.
Le tag criticité est un nombre entre 0 et 4.
[0-4]
After various tests, event_log_source is a string matching [a-zA-Z-_]+, but it may contain other characters as well; change the regexp in that case.
Après plusieurs tests, event_log_source est une chaîne correspondant à [a-zA-Z-_]+, mais elle pourrait contenir d'autres caractères ; il faudra changer la regexp dans ce cas.
[a-zA-Z]+(?:[ -][a-zA-Z]+)*
This is the type of SID used.
Type de SID utilisé.
(?:[A-Z][a-zA-Z]+(?: [A-Z][a-zA-Z]+)*)|(?:N/A)
String matching any one of the following alternatives.
Chaîne correspondant à l'une des alternatives suivantes.
Success Audit|Failure Audit|Error|Warning|Information
String naming one of the Windows audit event categories.
Chaîne correspondant à l'une des catégories d'événements d'audit de Windows.
[^\t]+|(?:N/A)
Hexadecimal number.
Nombre héxadécimal.
[0-9a-fA-F]{32}
url = "http://www.microsoft.com/technet/support/ee/SearchResults.aspx?Type=0&Message="
log['technet_link'] = url + str(value)
This is the Snare log format.
Description du format des logs Snare.
SNARE_EVENT_LOG_TYPE\s+CRITICALITY\s+SOURCE_NAME\s+SNARE_EVENT_COUNTER\s+[a-zA-Z]{3}. \w+ [0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{3}\s+EVENT_ID\s+EXPANDED_SOURCENAME\sUSER_NAME\s+SID_TYPE\s+EVENT_LOGTYPE\s+COMPUTER_NAME\s+CATEGORY_STRING\s+DATA_STRING(?:\s+MD5_CHECKSUM)?
'MSWinEventLog' for Snare for Windows.
'MSWinEventLog' pour Snare for Windows.
SNARE_EVENT_LOG_TYPE
This is determined by the alert level the user assigns to the objective in the Snare agent.
Niveau d'alerte configuré depuis les objectifs de l'agent Snare.
CRITICALITY
This is the Windows Event Log from which the event record was derived.
Nom du journal Windows d'où vient l'évènement.
SOURCE_NAME
Based on the internal Snare event counter.
Basé sur le compteur interne des évènements de Snare.
SNARE_EVENT_COUNTER
This is the Windows event ID.
ID de l'évènement Windows.
EVENT_ID
add_technet_link
This is the Windows Event Log from which the event record was derived.
Nom du journal Windows d'où vient l'évènement.
EXPANDED_SOURCENAME
This is the Windows' user name.
Nom d'utilisateur Windows.
USER_NAME
This is the type of SID used.
Type de SID utilisé.
SID_TYPE
This can be any one of 'Success Audit', 'Failure Audit', 'Error', 'Information', 'Warning'.
Peut être un des suivants: 'Success Audit', 'Failure Audit', 'Error', 'Information', 'Warning'.
EVENT_LOGTYPE
This is the Windows computer name.
Nom de l'ordinateur.
COMPUTER_NAME
This is the Windows audit event category string.
Catégorie de l'événement d'audit Windows.
CATEGORY_STRING
This contains the data string.
Contient la chaîne de donnée.
DATA_STRING
This is a MD5 checksum.
Empreinte MD5.
MD5_CHECKSUM
clo-PC MSWinEventLog 0 Security 191 mer. août 24 14:20:19 201 4688 Microsoft-Windows-Security-Auditing WORKGROUP\CLO-PC$ N/A Success Audit clo-PC Création du processus Un nouveau processus a été créé. Sujet : ID de sécurité : S-1-5-18 Nom du compte : CLO-PC$ Domaine du compte : WORKGROUP ID d’ouverture de session : 0x3e7 Informations sur le processus : ID du nouveau processus : 0x654 Nom du nouveau processus : C:\Windows\servicing\TrustedInstaller.exe Type d’élévation du jeton : Type d’élévation de jeton par défaut (1) ID du processus créateur : 0x1c8 Le type d’élévation du jeton indique le type de jeton qui a été attribué au nouveau processus conformément à la stratégie de contrôle du compte d’utilisateur. Le type 1 est un jeton complet sans aucun privilège supprimé ni aucun groupe désactivé. Un jeton complet est uniquement utilisé si le contrôle du compte d’utilisateur est désactivé, ou si l’utilisateur est le compte d’administrateur intégré ou un compte de service. Le type 2 est un jeton aux droits élevés sans aucun privilège supprimé ni aucun groupe désactivé. Un jeton aux droits élevés est utilisé lorsque le contrôle de compte d’utilisateur est activé et que l’utilisateur choisit de démarrer le programme en tant qu’administrateur. Un jeton aux droits élevés est également utilisé lorsqu’une application est configurée pour toujours exiger un privilège administratif ou pour toujours exiger les privilèges maximum, et que l’utilisateur est membre du groupe Administrateurs. Le type 3 est un jeton limité dont les privilèges administratifs sont supprimés et les groupes administratifs désactivés. Le jeton limité est utilisé lorsque le contrôle de compte d’ utilisateur est activé, que l’application n’exige pas le privilège administratif et que l’utilisateur ne choisit pas de démarrer le programme en tant qu’administrateur. 133
MSWinEventLog
0
Security
191
4688
Microsoft-Windows-Security-Auditing
WORKGROUP\CLO-PC$
N/A
Success Audit
clo-PC
Création du processus
Un nouveau processus a été créé. Sujet : ID de sécurité : S-1-5-18 Nom du compte : CLO-PC$ Domaine du compte : WORKGROUP ID d’ouverture de session : 0x3e7 Informations sur le processus : ID du nouveau processus : 0x654 Nom du nouveau processus : C:\Windows\servicing\TrustedInstaller.exe Type d’élévation du jeton : Type d’élévation de jeton par défaut (1) ID du processus créateur : 0x1c8 Le type d’élévation du jeton indique le type de jeton qui a été attribué au nouveau processus conformément à la stratégie de contrôle du compte d’utilisateur. Le type 1 est un jeton complet sans aucun privilège supprimé ni aucun groupe désactivé. Un jeton complet est uniquement utilisé si le contrôle du compte d’utilisateur est désactivé, ou si l’utilisateur est le compte d’administrateur intégré ou un compte de service. Le type 2 est un jeton aux droits élevés sans aucun privilège supprimé ni aucun groupe désactivé. Un jeton aux droits élevés est utilisé lorsque le contrôle de compte d’utilisateur est activé et que l’utilisateur choisit de démarrer le programme en tant qu’administrateur. Un jeton aux droits élevés est également utilisé lorsqu’une application est configurée pour toujours exiger un privilège administratif ou pour toujours exiger les privilèges maximum, et que l’utilisateur est membre du groupe Administrateurs. Le type 3 est un jeton limité dont les privilèges administratifs sont supprimés et les groupes administratifs désactivés. Le jeton limité est utilisé lorsque le contrôle de compte d’ utilisateur est activé, que l’application n’exige pas le privilège administratif et que l’utilisateur ne choisit pas de démarrer le programme en tant qu’administrateur. 133
MSWinEventLog 0 Security 313 ven. août 26 15:42:40 201 4689 Microsoft-Windows-Security-Auditing AUTORITE NT\SERVICE LOCAL N/A Success Audit a-zA-Z0-9_ Fin du processus Un processus est terminé. Sujet : ID de sécurité : S-1-5-19 Nom du compte : SERVICE LOCAL Domaine du compte : AUTORITE NT ID d’ouverture de session : 0x3e5 Informations sur le processus : ID du processus : 0xdf4 Nom du processus : C:\Windows\System32\taskhost.exe État de fin : 0x0 189
MSWinEventLog
0
Security
313
4689
Microsoft-Windows-Security-Auditing
AUTORITE NT\SERVICE LOCAL
N/A
Success Audit
a-zA-Z0-9_
Fin du processus
Un processus est terminé. Sujet : ID de sécurité : S-1-5-19 Nom du compte : SERVICE LOCAL Domaine du compte : AUTORITE NT ID d’ouverture de session : 0x3e5 Informations sur le processus : ID du processus : 0xdf4 Nom du processus : C:\Windows\System32\taskhost.exe État de fin : 0x0 189
<13>Aug 31 15:42:55 clo-vbox-win-7 MSWinEventLog 1 Security 103 mer. août 31 15:42:54 201 4776 Microsoft-Windows-Security-Auditing clo N/A Failure Audit clo-vbox-win-7 Validation des informations d’identification L’ordinateur a tenté de valider les informations d’identification d’un compte. Package d’authentification : MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Compte d’ouverture de session : clo Station de travail source : CLO-VBOX-WIN-7 Code d’erreur : 0xc000006e 77
MSWinEventLog
1
Security
103
4776
Microsoft-Windows-Security-Auditing
clo
N/A
Failure Audit
clo-vbox-win-7
Validation des informations d’identification
L’ordinateur a tenté de valider les informations d’identification d’un compte. Package d’authentification : MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Compte d’ouverture de session : clo Station de travail source : CLO-VBOX-WIN-7 Code d’erreur : 0xc000006e 77
EventLog
pylogsparser-0.4/normalizers/dansguardian.xml
This normalizer parses DansGuardian's access.log file. This
file logs every request made, whether allowed or denied, and gives the reason why a specific action
was taken.
Ce normaliseur traite le contenu du fichier access.log
utilisé par DansGuardian pour consigner les requêtes d'accès et le résultat associé.
mhu@wallix.com
the standard value format
le format utilisé pour les différents champs du log
[^\t]+
the date as it is logged by DansGuardian
la date telle qu'elle est consignée par DansGuardian
\d{4}\.\d{1,2}\.\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}
one of the HTTP methods defined by W3C
une méthode HTTP parmi celles définies par le W3C
GET|HEAD|CHECKOUT|SHOWMETHOD|PUT|DELETE|POST|LINK|UNLINK|CHECKIN|TEXTSEARCH|SPACEJUMP|SEARCH
the possible actions taken by dansguardian
les actions que peut appliquer dansguardian
(?:[*][A-Z]+[*] ?)*
r = re.compile("(?P<year>\d{4})\.(?P<month>\d{1,2})\.(?P<day>\d{1,2}) (?P<hour>\d{1,2}):(?P<minute>\d{1,2}):(?P<second>\d{1,2})")
m = r.match(value).groupdict()
m = dict( [(u, int(m[u])) for u in m.keys() ] )
log['date'] = datetime(**m)
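The date decoding above builds a datetime from named regex groups; in the callback context, `re`, `datetime` and `log` are supplied by the framework. A self-contained sketch of the same conversion:

```python
import re
from datetime import datetime

DG_DATE = re.compile(
    r"(?P<year>\d{4})\.(?P<month>\d{1,2})\.(?P<day>\d{1,2}) "
    r"(?P<hour>\d{1,2}):(?P<minute>\d{1,2}):(?P<second>\d{1,2})")

def decode_dg_date(value):
    # each named group becomes a keyword argument of datetime()
    parts = {k: int(v) for k, v in DG_DATE.match(value).groupdict().items()}
    return datetime(**parts)

# decode_dg_date("2011.12.13 7:38:50") -> datetime(2011, 12, 13, 7, 38, 50)
```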
if "DENIED" in value:
log['action'] = "DENIED"
elif "EXCEPTION" in value:
log['action'] = "EXCEPTION"
if "INFECTED" in value:
log['scan_result'] = "infected"
# the following tags are not the "official" designation. There is no official designation actually.
if "SCANNED" in value:
log['scan_result'] = "clean"
if "CONTENTMOD" in value:
log['content_modified'] = "true"
else:
log['content_modified'] = "false"
if "URLMOD" in value:
log['url_modified'] = "true"
else:
log['url_modified'] = "false"
if "HEADERMOD" in value:
log['header_modified'] = "true"
else:
log['header_modified'] = "false"
The standard access.log line pattern.
Le format usuel d'une ligne de log issue du fichier access.log.
WHEN WHO FROM WHERE WHAT WHY HOW SIZE NAUGHTINESS (?:CATEGORY)? FILTERGROUPNUMBER RETURNCODE (?:MIMETYPE)? (?:CLIENTNAME)? (?:FILTERGROUPNAME)? (?:USERAGENT)?
the log's date
la date du log
WHEN
decode_DGDate
a user or a computer, if an "authplugin" has identified it
un nom d'utilisateur ou d'équipement, s'il a été identifié par un "authplugin"
WHO
the IP address of the requestor
l'adresse IP d'origine de la requête
FROM
the complete requested URL
l'URL de la requête
WHERE
the list of actions taken by dansguardian, as it appears in the log file. This list is refined in relevant tags such as "action", "scan_result", "url_modified", "content_modified" and "header_modified" when applicable.
la liste des actions prises par dansguardian, telle qu'elle apparaît dans le fichier de log. Cette liste sert à définir d'autres tags : "action", "scan_result", "url_modified", "content_modified" et "header_modified" quand cela est pertinent.
WHAT
decode_DGactions
why the actions were taken
la raison pour laquelle les actions ont été appliquées
WHY
the HTTP request verb
la méthode HTTP
HOW
the size in bytes of the document, if fetched
la taille du document en octets, s'il a été récupéré
SIZE
the sum of all the weighted phrase scores
le score total d'inadéquation
NAUGHTINESS
the contents of the #listcategory tag, if any, in the list that is most relevant to the action
le contenu éventuel de la métadonnée #listcategory la plus pertinente par rapport à l'action
CATEGORY
the filter group the request was assigned to
le groupe de filtrage auquel la requête a été assignée
FILTERGROUPNUMBER
the HTTP return code
le code HTTP de la réponse
RETURNCODE
the MIME type, if relevant, of the returned document
le type MIME, si applicable, de la réponse
MIMETYPE
if configured, the result of a reverse DNS IP lookup on the requestor's IP address
si activée, la résolution DNS inversée de l'IP d'origine de la requête
CLIENTNAME
the name of the filter group
le nom du groupe de filtrage
FILTERGROUPNAME
the browser's user agent string
la valeur du champ "user agent" exposée par le navigateur
USERAGENT
2011.12.13 7:38:50 10.10.42.23 10.10.42.23 http://backports.debian.org/debian-backports/dists/squeeze-backports/main/binary-i386/Packages.diff/2011-12-02-1137.04.gz *DENIED* Type de fichier interdit: .gz GET 0 0 Banned extension 2 403 application/x-gzip limited_access -
10.10.42.23
10.10.42.23
http://backports.debian.org/debian-backports/dists/squeeze-backports/main/binary-i386/Packages.diff/2011-12-02-1137.04.gz
*DENIED*
DENIED
Type de fichier interdit: .gz
GET
0
0
Banned extension
2
403
application/x-gzip
limited_access
-
web proxy
2011.12.13 12:10:48 10.10.42.23 10.10.42.23 http://safebrowsing-cache.google.com/safebrowsing/rd/ChNnb29nLW1hbHdhcmUtc2hhdmFyEAEY9p8EIPafBDIF9g8BAAE *EXCEPTION* Site interdit trouvé. GET 326 0 2 200 - limited_access -
10.10.42.23
10.10.42.23
http://safebrowsing-cache.google.com/safebrowsing/rd/ChNnb29nLW1hbHdhcmUtc2hhdmFyEAEY9p8EIPafBDIF9g8BAAE
*EXCEPTION*
EXCEPTION
Site interdit trouvé.
GET
326
0
2
200
-
limited_access
-
web proxy
A variation on the access.log line pattern, as it appears in dansguardian's sourcecode.
Une variation sur le format usuel, telle qu'elle apparaît dans le code source de dansguardian.
WHEN\tWHO\tFROM\tWHERE\tWHAT\tWHY\tHOW\tSIZE\tNAUGHTINESS\t(?:CATEGORY)?\tFILTERGROUPNUMBER\tRETURNCODE\t(?:MIMETYPE)?\t(?:CLIENTNAME)?\t(?:FILTERGROUPNAME)?\t(?:USERAGENT)?
the log's date
la date du log
WHEN
decode_DGDate
a user or a computer, if an "authplugin" has identified it
un nom d'utilisateur ou d'équipement, s'il a été identifié par un "authplugin"
WHO
the IP address of the requestor
l'adresse IP d'origine de la requête
FROM
the complete requested URL
l'URL de la requête
WHERE
the list of actions taken by dansguardian, as it appears in the log file. This list is refined in relevant tags such as "action", "scan_result", "url_modified", "content_modified" and "header_modified" when applicable.
la liste des actions prises par dansguardian, telle qu'elle apparaît dans le fichier de log. Cette liste sert à définir d'autres tags : "action", "scan_result", "url_modified", "content_modified" et "header_modified" quand cela est pertinent.
WHAT
decode_DGactions
why the actions were taken
la raison pour laquelle les actions ont été appliquées
WHY
the HTTP request verb
la méthode HTTP
HOW
the size in bytes of the document, if fetched
la taille du document en octets, s'il a été récupéré
SIZE
the sum of all the weighted phrase scores
le score total d'inadéquation
NAUGHTINESS
the contents of the #listcategory tag, if any, in the list that is most relevant to the action
le contenu éventuel de la métadonnée #listcategory la plus pertinente par rapport à l'action
CATEGORY
the filter group the request was assigned to
le groupe de filtrage auquel la requête a été assignée
FILTERGROUPNUMBER
the HTTP return code
le code HTTP de la réponse
RETURNCODE
the MIME type, if relevant, of the returned document
le type MIME, si applicable, de la réponse
MIMETYPE
if configured, the result of a reverse DNS IP lookup on the requestor's IP address
si activée, la résolution DNS inversée de l'IP d'origine de la requête
CLIENTNAME
the name of the filter group
le nom du groupe de filtrage
FILTERGROUPNAME
the browser's user agent string
la valeur du champ "user agent" exposée par le navigateur
USERAGENT
A CSV version on the access.log line pattern, as it appears in dansguardian's sourcecode.
Une version CSV de la ligne de log, telle qu'elle apparaît dans le code source de dansguardian.
"WHEN","WHO","FROM","WHERE","WHAT","WHY","HOW","SIZE","NAUGHTINESS","(?:CATEGORY)?","FILTERGROUPNUMBER","RETURNCODE","(?:MIMETYPE)?","(?:CLIENTNAME)?","(?:FILTERGROUPNAME)?","(?:USERAGENT)?"
the log's date
la date du log
WHEN
decode_DGDate
a user or a computer, if an "authplugin" has identified it
un nom d'utilisateur ou d'équipement, s'il a été identifié par un "authplugin"
WHO
the IP address of the requestor
l'adresse IP d'origine de la requête
FROM
the complete requested URL
l'URL de la requête
WHERE
the list of actions taken by dansguardian, as it appears in the log file. This list is refined in relevant tags such as "action", "scan_result", "url_modified", "content_modified" and "header_modified" when applicable.
la liste des actions prises par dansguardian, telle qu'elle apparaît dans le fichier de log. Cette liste sert à définir d'autres tags : "action", "scan_result", "url_modified", "content_modified" et "header_modified" quand cela est pertinent.
WHAT
decode_DGactions
why the actions were taken
la raison pour laquelle les actions ont été appliquées
WHY
the HTTP request verb
la méthode HTTP
HOW
the size in bytes of the document, if fetched
la taille du document en octets, s'il a été récupéré
SIZE
the sum of all the weighted phrase scores
le score total d'inadéquation
NAUGHTINESS
the contents of the #listcategory tag, if any, in the list that is most relevant to the action
le contenu éventuel de la métadonnée #listcategory la plus pertinente par rapport à l'action
CATEGORY
the filter group the request was assigned to
le groupe de filtrage auquel la requête a été assignée
FILTERGROUPNUMBER
the HTTP return code
le code HTTP de la réponse
RETURNCODE
the MIME type, if relevant, of the returned document
le type MIME, si applicable, de la réponse
MIMETYPE
if configured, the result of a reverse DNS IP lookup on the requestor's IP address
si activée, la résolution DNS inversée de l'IP d'origine de la requête
CLIENTNAME
the name of the filter group
le nom du groupe de filtrage
FILTERGROUPNAME
the browser's user agent string
la valeur du champ "user agent" exposée par le navigateur
USERAGENT
dansguardian
pylogsparser-0.4/normalizers/named-2.xml 0000644 0001750 0001750 00000056054 11705765631 016544 0 ustar fbo fbo
fbo@wallix.com
\S+
\S+
\S+
\d+-\w+-\d{4} \d+:\d+:\d+\.\d+
default:|general:|database:|security:|config:|resolver:|xfer-in:|xfer-out:|notify:|client:|unmatched:|network:|update:|update-security:|queries:|dispatch:|dnssec:|lame-servers:|edns-disabled:
emerg:|alert:|crit:|error:|warn:|notice:|info:|debug:
log['category'] = value.rstrip(':')
# define severities
SEVERITIES = [ "emerg",
"alert",
"crit",
"error",
"warn",
"notice",
"info",
"debug" ]
severity = value.rstrip(':')
log["severity"] = severity
log["severity_code"] = SEVERITIES.index(severity)
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: transfer of 'ZONE/CLASS': TYPE ACTION$
Client IP address related to this request
Adresse IP du client ayant généré la requête
IP
UDP client port
Port UDP du client
PORT
DNS Zone related to this request
Zone DNS concernée par la requête
ZONE
Action taken by server
Action prise par le serveur
ACTION
Requested DNS Class (CLASS)
Classe DNS de la requête
CLASS
Requested DNS recording Type (TYPE)
Type (TYPE) d'enregistrement DNS demandé
TYPE
DATE
dd-MMM-YYYY hh:mm:ss
Subsystem category
Catégorie de sous-système
CATEGORY
decode_named_category
Message severity
Sévérité du message
SEVERITY
decode_named_severity
zone_transfer
named
client 10.10.4.4#35129: transfer of 'qa.ifr.lan/IN': AXFR started
10.10.4.4
zone_transfer
qa.ifr.lan
IN
started
AXFR
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?transfer of 'ZONE/CLASS' from IP#PORT: ACTION of transfer
IP
PORT
ZONE
ACTION
CLASS
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
zone_transfer
named
transfer of 'localdomain/IN' from 192.168.1.3#53: end of transfer
192.168.1.3
zone_transfer
localdomain
IN
end
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?transfer of 'ZONE/CLASS' from IP#PORT: failed while receiving responses: REFUSED
IP
PORT
ZONE
CLASS
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
zone_transfer_failure
named
failed
transfer of 'ns2.domain.de/IN' from 192.168.0.5#53: failed while receiving responses: REFUSED
192.168.0.5
zone_transfer_failure
ns2.domain.de
IN
failed
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?lame server resolving 'DOMAIN' \(in 'ZONE'\?\): IP#PORT
IP
PORT
ZONE
DOMAIN
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
lame_server_report
named
lame server resolving 'www.cadenhead.org' (in 'cadenhead.org'?): 67.19.3.218#53
67.19.3.218
lame_server_report
cadenhead.org
www.cadenhead.org
named
name resolution
pylogsparser-0.4/normalizers/s3.xml 0000644 0001750 0001750 00000030446 11715703401 015627 0 ustar fbo fbo
S3 log normalization.
S3 logs consist of a list of values.
Normalized keys are "bucket_owner", "bucket", "date", "ip", "requestor", "requestid", "operation", "key", "http_method", "http_target", "http_proto", "http_status", "s3err", "sent", "object_size", "total_request_time", "turn_around_time", "referer", "user_agent" and "version_id".
Ce normaliseur analyse les logs émis par S3.
Les messages S3 consistent en une liste de valeurs.
Les clés extraites par ce normaliseur sont "bucket_owner", "bucket", "date", "ip", "requestor", "requestid", "operation", "key", "http_method", "http_target", "http_proto", "http_status", "s3err", "sent", "object_size", "total_request_time", "turn_around_time", "referer", "user_agent" et "version_id".
olivier.hervieu@tinyclues.com
Matches S3 common time format.
Une expression correspondant au format d'horodatage des logs S3.
\[\d{1,2}/.{3}/\d{4}:\d{1,2}:\d{1,2}:\d{1,2}(?: [+-]\d{4})?\]
Matches S3 quoted strings.
Permet de parser les chaînes de caractères entre guillemets des logs S3.
\".*\"
\S+
value = value[1:-1].split(' ')
log['http_method'] = value[0]
log['http_target'] = value[1]
log['protocol'] = value[2]
value = value[1:-1]
log['referer'] = value
value = value[1:-1]
log['user_agent'] = value
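The quoted-string callbacks above can be sketched as standalone functions. This is a minimal illustration: in the real normalizer, `value` and `log` are supplied by the parsing framework rather than passed explicitly.

```python
# Standalone sketch of the split_s3_info callback defined above.
def split_s3_info(value, log):
    parts = value[1:-1].split(' ')   # strip the surrounding quotes, split on spaces
    log['http_method'] = parts[0]    # e.g. GET
    log['http_target'] = parts[1]    # e.g. /?acl
    log['protocol'] = parts[2]       # e.g. HTTP/1.1
    return log

log = split_s3_info('"GET /?acl HTTP/1.1"', {})
# log == {'http_method': 'GET', 'http_target': '/?acl', 'protocol': 'HTTP/1.1'}
```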
Generic s3 log pattern.
Parseur générique des logs S3.
OWNER NAME DATE IP REQUESTOR REQUESTID OP KEY HTTP_METHOD HTTP_STATUS S3ERR SENT SIZE TOTAL TAT REF AGENT VID
The canonical user id of the owner of the source bucket
Identifiant canonique du propriétaire du bucket
OWNER
the bucket name
le nom du bucket
NAME
the time at which the request was issued. Please note that the timezone information is not carried over
la date à laquelle la requête a été émise. L'information de fuseau horaire n'est pas prise en compte
DATE
dd/MMM/YYYY:hh:mm:ss
The apparent Internet address of the requester.
Adresse IP apparente de la requête.
IP
The canonical user id of the requester.
Identifiant canonique du requeteur.
REQUESTOR
request id
id de la requête
REQUESTID
operation type
type de l'opération
OP
The "key" part of the request, URL encoded, or "-" if the operation does not take a key parameter.
KEY
The Request-URI part of the HTTP request message.
HTTP_METHOD
split_s3_info
The numeric HTTP status code of the response.
Code numérique de retour de la requête HTTP.
HTTP_STATUS
The Amazon S3 Error Code, or "-" if no error occurred.
Code d'erreur S3 ou "-".
S3ERR
The number of response bytes sent, excluding HTTP protocol overhead, or "-" if zero.
SENT
The total size of the object in question.
SIZE
The number of milliseconds the request was in flight from the server's perspective.
TOTAL
The number of milliseconds that Amazon S3 spent processing your request.
TAT
The value of the HTTP Referer header, if present
REF
refer_unquote
The value of the HTTP User-Agent header.
AGENT
agent_unquote
The version ID in the request, or "-" if the operation does not take a versionId parameter.
VID
s3
DEADBEEF testbucket [19/Jul/2011:13:17:11 +0000] 10.194.22.16 FACEDEAD CAFEDECA REST.GET.ACL - "GET /?acl HTTP/1.1" 200 - 951 - 397 - "-" "Jakarta Commons-HttpClient/3.0" -
DEADBEEF
testbucket
10.194.22.16
FACEDEAD
CAFEDECA
REST.GET.ACL
-
GET
/?acl
HTTP/1.1
200
-
951
-
397
-
-
Jakarta Commons-HttpClient/3.0
-
pylogsparser-0.4/normalizers/normalizer.template 0000644 0001750 0001750 00000007147 11705765631 020515 0 ustar fbo fbo
pylogsparser-0.4/normalizers/named.xml 0000644 0001750 0001750 00000217666 11705765631 016415 0 ustar fbo fbo
fbo@wallix.com
\S+
\S+
\S+
\d+-\w+-\d{4} \d+:\d+:\d+\.\d+
default:|general:|database:|security:|config:|resolver:|xfer-in:|xfer-out:|notify:|client:|unmatched:|network:|update:|update-security:|queries:|dispatch:|dnssec:|lame-servers:|edns-disabled:
emerg:|alert:|crit:|error:|warn:|notice:|info:|debug:
view \S+:
log['category'] = value.rstrip(':')
# define severities
SEVERITIES = [ "emerg",
"alert",
"crit",
"error",
"warn",
"notice",
"info",
"debug" ]
severity = value.rstrip(':')
log["severity"] = severity
log["severity_code"] = SEVERITIES.index(severity)
view = value.rstrip(':').split()[-1]
log["view"] = view
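The severity callback above maps a trailing-colon token to a name and a numeric code (lower is more severe). A minimal sketch, with `value` and `log` as plain arguments instead of framework-provided names:

```python
# Sketch of the decode_named_severity callback above.
SEVERITIES = ["emerg", "alert", "crit", "error",
              "warn", "notice", "info", "debug"]

def decode_named_severity(value, log):
    severity = value.rstrip(':')          # "crit:" -> "crit"
    log["severity"] = severity
    log["severity_code"] = SEVERITIES.index(severity)
    return log

decode_named_severity("crit:", {})
# -> {'severity': 'crit', 'severity_code': 2}
```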
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: (?:VIEW )?query: DOMAIN CLASS TYPE \S+$
Client IP address related to this request
Adresse IP du client ayant généré la requête
IP
UDP client port
Port UDP du client
PORT
Domain requested by client
Domaine concerné par la requête du client
DOMAIN
Requested DNS Class (CLASS)
Classe DNS de la requête
CLASS
Requested DNS recording Type (TYPE)
Type (TYPE) d'enregistrement DNS demandé
TYPE
DATE
dd-MMM-YYYY hh:mm:ss
Subsystem category
Catégorie de sous-système
CATEGORY
decode_named_category
Message severity
Sévérité du message
SEVERITY
decode_named_severity
DNS view related to this request
Vue DNS associée à cette requête
VIEW
decode_named_view
client_query
named
client 10.10.4.4#39583: query: tpf.qa.ifr.lan IN SOA +
client_query
tpf.qa.ifr.lan
10.10.4.4
39583
SOA
IN
named
name resolution
client 10.10.4.4#39583: view external: query: tpf.qa.ifr.lan IN SOA +
client_query
tpf.qa.ifr.lan
10.10.4.4
39583
SOA
IN
named
external
name resolution
28-Feb-2000 15:05:32.863 client 10.10.4.4#39583: query: tpf.qa.ifr.lan IN SOA +
client_query
tpf.qa.ifr.lan
10.10.4.4
39583
SOA
IN
named
2000-02-28 15:05:32.863000
name resolution
28-Feb-2000 15:05:32.863 general: client 10.10.4.4#39583: query: tpf.qa.ifr.lan IN SOA +
client_query
tpf.qa.ifr.lan
10.10.4.4
39583
SOA
IN
named
2000-02-28 15:05:32.863000
general
name resolution
28-Feb-2000 15:05:32.863 general: client 10.10.4.4#39583: view external: query: tpf.qa.ifr.lan IN SOA +
client_query
tpf.qa.ifr.lan
10.10.4.4
39583
SOA
IN
named
2000-02-28 15:05:32.863000
general
external
name resolution
queries: client 10.10.4.4#39583: query: tpf.qa.ifr.lan IN SOA +
client_query
tpf.qa.ifr.lan
10.10.4.4
39583
SOA
IN
named
queries
name resolution
28-Feb-2000 15:05:32.863 general: crit: client 10.10.4.4#39583: query: tpf.qa.ifr.lan IN SOA +
client_query
tpf.qa.ifr.lan
10.10.4.4
39583
SOA
IN
named
2000-02-28 15:05:32.863000
general
crit
2
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: query 'DOMAIN/TYPE/CLASS' denied$
IP
PORT
DOMAIN
CLASS
TYPE
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
client_query_denied
denied
named
client 127.0.0.1#44063: query 'www.example.com/A/IN' denied
client_query_denied
www.example.com
127.0.0.1
44063
A
IN
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: query denied$
IP
PORT
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
client_query_denied
denied
named
client 127.0.0.1#1126: query denied
client_query_denied
127.0.0.1
1126
denied
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: query \(cache\) denied$
IP
PORT
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
client_query_denied
denied
named
client 127.0.0.1#1126: query (cache) denied
client_query_denied
127.0.0.1
1126
denied
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: query \(cache\) 'DOMAIN/TYPE/CLASS' denied$
IP
PORT
DOMAIN
CLASS
TYPE
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
client_query_denied
denied
named
client 219.135.228.103#17635: query (cache) 'mycompany.com.cn/MX/IN' denied
client_query_denied
mycompany.com.cn
219.135.228.103
17635
MX
IN
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?createfetch: DOMAIN TYPE$
DOMAIN
TYPE
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
fetch_request
named
createfetch: 126.92.194.77.zen.spamhaus.org A
fetch_request
126.92.194.77.zen.spamhaus.org
A
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?zone ZONE/CLASS: transferred serial SERIAL$
DNS Zone related to this request
Zone DNS concernée par la requête
ZONE
CLASS
Transaction serial number
Numéro de série de la transaction
SERIAL
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
zone_transfer
named
zone localdomain/IN: transferred serial 2006070304
zone_transfer
localdomain
IN
2006070304
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: zone transfer 'ZONE/CLASS' denied$
IP
PORT
ZONE
CLASS
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
zone_transfer_denied
denied
named
client 219.135.228.103#17635: zone transfer 'somedomain.com/IN' denied
219.135.228.103
17635
zone_transfer_denied
somedomain.com
IN
named
denied
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: bad zone transfer request: 'ZONE/CLASS': .*
IP
PORT
ZONE
CLASS
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
zone_transfer_bad
named
client 192.168.198.130#4532: bad zone transfer request: 'www.abc.com/IN': non-authoritative zone (NOTAUTH)
192.168.198.130
4532
zone_transfer_bad
www.abc.com
IN
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?zone ZONE/CLASS: refresh: failure trying master IP#PORT: timed out$
DNS master IP address
Adresse IP du master DNS
IP
DNS master PORT
Port du master DNS
PORT
ZONE
CLASS
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
zone_transfer_timeout
named
failure
zone example.com/IN: refresh: failure trying master 1.2.3.4#53: timed out
1.2.3.4
53
zone_transfer_timeout
example.com
IN
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?zone ZONE/CLASS: refresh: retry limit for master IP#PORT exceeded$
IP
PORT
ZONE
CLASS
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
zone_refresh_limit
named
retry
zone somedomain.com.au/IN: refresh: retry limit for master 1.2.3.4#53 exceeded
1.2.3.4
53
zone_refresh_limit
somedomain.com.au
IN
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: updating zone 'ZONE/CLASS': ACTION an \S+ at 'DOMAIN' TYPE$
IP
PORT
ZONE
CLASS
TYPE
Action taken by server
Action prise par le serveur
ACTION
DOMAIN
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
zone_update
named
client 127.0.0.1#32839: updating zone 'home.whootis.com/IN': adding an RR at 'pianogirl.home.whootis.com' TXT
127.0.0.1
zone_update
home.whootis.com
IN
TXT
adding
pianogirl.home.whootis.com
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: updating zone 'ZONE/CLASS': update failed: .*
IP
PORT
ZONE
CLASS
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
zone_update_failure
named
failed
client 10.10.1.8#53147: updating zone 'clima-tech.com/IN': update failed: rejected by secure update (REFUSED)
10.10.1.8
zone_update_failure
clima-tech.com
IN
failed
named
name resolution
(?:DATE )?(?:CATEGORY )?(?:SEVERITY )?client IP#PORT: update 'ZONE/CLASS' denied
IP
PORT
ZONE
CLASS
DATE
dd-MMM-YYYY hh:mm:ss
CATEGORY
decode_named_category
SEVERITY
decode_named_severity
zone_update_failure
named
denied
client 10.10.1.8#53147: update 'clima-tech.com/IN' denied
10.10.1.8
zone_update_failure
clima-tech.com
IN
denied
named
name resolution
pylogsparser-0.4/normalizers/common_callBacks.xml 0000644 0001750 0001750 00000025012 11710220376 020522 0 ustar fbo fbo
]>
dd matches the number of the day (1, 2, 3, etc...)
MM matches the name of the month (Jan, Feb, Mar, etc...)
YYYY matches the year (2012)
hh:mm:ss matches the time (23:54:42)
r = re.compile('(?P<month>\d{2})/(?P<day>\d{2})/(?P<year>\d{4}) (?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})')
date = r.search(value).groupdict()
date = dict([(u, int(date[u])) for u in date.keys()])
newdate = datetime(**date)
log['date'] = newdate
dd matches the number of the day (1, 2, 3, etc...)
MMM matches the name of the month (Jan, Feb, Mar, etc...)
YYYY matches the year (2012)
hh:mm:ss matches the time (23:54:42)
english_months = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6, 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
ctf = re.compile("(?P<day>\d+)/(?P<month>[a-zA-Z]+)/(?P<year>\d+):(?P<hour>\d+):(?P<minute>\d+):(?P<second>\d+)")
m = ctf.search(value)
if m:
vals = m.groupdict()
vals['month'] = english_months[vals['month']]
vals = dict( [ (u, int(vals[u])) for u in vals.keys() ])
newdate = datetime(**vals)
log['date'] = newdate
else:
raise Exception("invalid date string %s" % value)
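Extracted as a standalone function (a sketch; the framework normally provides `value`, `log`, `re` and `datetime`), the dd/MMM/YYYY decoder above behaves like this:

```python
import re
from datetime import datetime

ENGLISH_MONTHS = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
                  'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
CTF = re.compile(r"(?P<day>\d+)/(?P<month>[a-zA-Z]+)/(?P<year>\d+)"
                 r":(?P<hour>\d+):(?P<minute>\d+):(?P<second>\d+)")

def decode_common_time_format(value):
    m = CTF.search(value)
    if m is None:
        raise ValueError("invalid date string %r" % value)
    vals = m.groupdict()
    vals['month'] = ENGLISH_MONTHS[vals['month']]  # month name -> number
    return datetime(**dict((k, int(v)) for k, v in vals.items()))

decode_common_time_format("19/Jul/2011:13:17:11")
# -> datetime(2011, 7, 19, 13, 17, 11)
```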
MMM matches the name of the month (Jan, Feb, Mar, etc...)
dd matches the number of the day (1, 2, 3, etc...)
hh:mm:ss matches the time (23:54:42)
MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
now = datetime.now()
currentyear = now.year
# The following call may raise ValueError on malformed input
newdate = datetime(currentyear,
MONTHS.index(value[0:3]) + 1,
int(value[4:6]),
int(value[7:9]),
int(value[10:12]),
int(value[13:15]))
if newdate > datetime.today():
newdate = newdate.replace(year = newdate.year - 1)
log['date'] = newdate
MMM matches the name of the month (Jan, Feb, Mar, etc...)
dd matches the number of the day (1, 2, 3, etc...)
hh:mm:ss matches the time (23:54:42)
YYYY matches the year (2012)
MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
# The following call may raise ValueError on malformed input
newdate = datetime(int(value[7:11]),
MONTHS.index(value[0:3]) + 1,
int(value[4:6]),
int(value[12:14]),
int(value[15:17]),
int(value[18:20]))
log['date'] = newdate
DDD matches the name of the day (Mon, Tue, Wed, etc...)
MMM matches the name of the month (Jan, Feb, Mar, etc...)
dd matches the number of the day (1, 2, 3, etc...)
hh:mm:ss matches the time (23:54:42)
YYYY matches the year (2012)
reg = re.compile(u'(?P<month>[A-Z]{1}[a-z]{2}) (?P<day>\d{1,2}) (?P<hours>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2}) (?P<year>\d{4})')
month = {'Jan' : 1,
'Feb' : 2,
'Mar' : 3,
'Apr' : 4,
'May' : 5,
'Jun' : 6,
'Jul' : 7,
'Aug' : 8,
'Sep' : 9,
'Oct' : 10,
'Nov' : 11,
'Dec' : 12}
date = reg.search(value).groupdict()
year = int(date.get('year'))
month = month.get(date.get('month', None), None)
day = int(date.get('day'))
hours = int(date.get('hours'))
minutes = int(date.get('minutes'))
seconds = int(date.get('seconds'))
newdate = datetime(year, month, day, hours, minutes, seconds)
log['date'] = newdate
YYYY matches the year (2012)
MM matches the number of the month (01, 02, 03 etc...)
DD matches the number of the day (01, 02, 03, etc...)
hh:mm:ss matches the time (23:54:42)
reg = re.compile('(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2}) (?P<hours>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})')
date = reg.search(value).groupdict()
year= int(date.get('year'))
month = int(date.get('month'))
day = int(date.get('day'))
hours = int(date.get('hours'))
minutes = int(date.get('minutes'))
seconds = int(date.get('seconds'))
newdate = datetime(year, month, day, hours, minutes, seconds)
log['date'] = newdate
MM matches the number of the month (01, 02, 03 etc...)
DD matches the number of the day (01, 02, 03, etc...)
YY matches the year (12)
hh:mm:ss matches the time (23:54:42)
The year is arbitrarily assumed to fall in the 21st century.
reg = re.compile('(?P<month>\d{2})/(?P<day>\d{2})/(?P<year>\d{2}), (?P<hours>\d{1,2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})')
date = reg.search(value)
date = date.groupdict()
year= int(date.get('year'))
month = int(date.get('month'))
day = int(date.get('day'))
hours = int(date.get('hours'))
minutes = int(date.get('minutes'))
seconds = int(date.get('seconds'))
newdate = datetime(2000 + year, month, day, hours, minutes, seconds)
log['date'] = newdate
YY matches the year (12)
MM matches the number of the month (01, 02, 03 etc...)
DD matches the number of the day (01, 02, 03, etc...)
hh:mm:ss matches the time (23:54:42)
reg = re.compile('(?P<year>[0-9]{2})(?P<month>[0-9]{2})(?P<day>[0-9]{2}) (?P<hours>(?:[0-9]{2}| [0-9])):(?P<minutes>[0-9]{2}):(?P<seconds>[0-9]{2})')
date = reg.search(value)
date = date.groupdict()
year= int(date.get('year'))
month = int(date.get('month'))
day = int(date.get('day'))
hours = int(date.get('hours'))
minutes = int(date.get('minutes'))
seconds = int(date.get('seconds'))
newdate = datetime(2000 + year, month, day, hours, minutes, seconds)
log["date"] = newdate
Converts a combined date and time in UTC expressed according to the ISO 8601
standard. Also commonly referred to as "Zulu Time".
Precision can be up to the millisecond.
r = re.compile("""
(?P<year>\d{4})-
(?P<month>\d{2})-
(?P<day>\d{2})
T(?P<hour>\d{2}):
(?P<minute>\d{2}):
(?:(?P<second>\d{2})
(?:\.(?P<microsecond>\d{3}))?)?Z""", re.VERBOSE)
m = r.match(value).groupdict()
m = dict( [ (u, v and int(v) or 0) for u,v in m.items() ] )
m['microsecond'] = m['microsecond'] * 1000
log['date'] = datetime(**m)
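The Zulu-time decoder above can be sketched as a standalone function. This variant places the seconds separator inside the optional group, but the parsing logic (default missing fields to zero, scale milliseconds to microseconds) is the same:

```python
import re
from datetime import datetime

# Seconds and milliseconds are optional, as in the callback above.
ISO8601 = re.compile(r"""
    (?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})
    T(?P<hour>\d{2}):(?P<minute>\d{2})
    (?::(?P<second>\d{2})(?:\.(?P<microsecond>\d{3}))?)?Z""", re.VERBOSE)

def decode_iso8601(value):
    m = ISO8601.match(value).groupdict()
    m = dict((k, int(v) if v else 0) for k, v in m.items())
    m['microsecond'] *= 1000   # the log carries milliseconds
    return datetime(**m)

decode_iso8601("2012-02-15T13:04:05.123Z")
# -> datetime(2012, 2, 15, 13, 4, 5, 123000)
```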
Converts an EPOCH timestamp to a human-readable date.
log['date'] = datetime.utcfromtimestamp(float(value))
Converts a date such as 28-Feb-2010 23:15:54. This format is used in BIND9 logs among others.
Precision can be up to the millisecond.
MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
r = re.compile('(?P<day>\d+)-(?P<month>\w+)-(?P<year>\d{4}) (?P<hour>\d+):(?P<minute>\d+):(?P<second>\d+)(?:\.(?P<microsecond>\d+))?')
m = r.match(value).groupdict()
m['month'] = MONTHS.index(m['month']) + 1
m = dict( [ (u, v and int(v) or 0) for u,v in m.items() ] )
m['microsecond'] = m['microsecond'] * 1000
log['date'] = datetime(**m)
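The BIND9-style date decoder above can likewise be exercised on its own. A sketch, with `value` passed in explicitly and the result returned instead of stored in `log`:

```python
import re
from datetime import datetime

MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
BIND_DATE = re.compile(r'(?P<day>\d+)-(?P<month>\w+)-(?P<year>\d{4}) '
                       r'(?P<hour>\d+):(?P<minute>\d+):(?P<second>\d+)'
                       r'(?:\.(?P<microsecond>\d+))?')

def decode_bind_date(value):
    m = BIND_DATE.match(value).groupdict()
    m['month'] = MONTHS.index(m['month']) + 1   # month name -> number
    m = dict((k, int(v) if v else 0) for k, v in m.items())
    m['microsecond'] *= 1000   # milliseconds -> microseconds
    return datetime(**m)

decode_bind_date("28-Feb-2000 15:05:32.863")
# -> datetime(2000, 2, 28, 15, 5, 32, 863000)
```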
pylogsparser-0.4/normalizers/normalizer.dtd 0000644 0001750 0001750 00000017066 11705765631 017456 0 ustar fbo fbo
pylogsparser-0.4/normalizers/LEA.xml 0000644 0001750 0001750 00000020243 11705765631 015711 0 ustar fbo fbo
This normalizer handles LEA (Log Export API) normalization. The LEA format is used by CheckPoint products to export logs to a LogBox.
The formatting with | as a fields separator is due to the use of FW1-LogGrabber for log fetching.
Due to the dynamic nature of this logging format, please refer to your product's documentation to find out more about tagging.
Ce normaliseur analyse les logs émis en utilisant l'API d'export de logs (LEA). Cette API peut être utilisée pour la réception de logs en provenance d'équipements CheckPoint.
Le formatage des champs séparés par le caractère | est dû à la récupération des logs via l'utilitaire FW1-LogGrabber.
En raison de la nature dynamique de ce format de log, les tags extraits peuvent varier en fonction des événements consignés. Veuillez vous référer à la documentation de votre équipement exposant LEA pour de plus amples informations.
mhu@wallix.com
LEA fields as "key=value", separated by |
Champs descriptifs au format "clé=valeur", séparés par le caractère |
(?:[^ =]+=[^|]+|)*[^ =]+=[^|]+
# These are the only tags we extract
KNOWN = [ ("loc", "id"),
"product",
"i/f_dir",
"i/f_name",
"orig",
"type",
"action",
("proto", "protocol"),
"rule",
"src",
"dst",
("s_port", "source_port"),
("service", "dest_port"),
("uuid", "lea_uuid") ]
def src_dst_extract(data):
ip_re = re.compile("(?<![.0-9])((?:[0-9]{1,3}[.]){3}[0-9]{1,3})(?![.0-9])")
if ip_re.match(data['src']):
data['source_ip'] = data['src']
else:
data['source_host'] = data['src']
if ip_re.match(data['dst']):
data['dest_ip'] = data['dst']
else:
data['dest_host'] = data['dst']
if ip_re.match(data['orig']):
data['local_ip'] = data['orig']
else:
data['local_host'] = data['orig']
del data['src']
del data['dst']
del data['orig']
def int_extract(data):
if 'i/f_dir' in data.keys():
if data['i/f_dir'] == 'inbound':
data['inbound_int'] = data['i/f_name']
if data['i/f_dir'] == 'outbound':
data['outbound_int'] = data['i/f_name']
del data['i/f_dir']
del data['i/f_name']
dic = {}
body = value.split('|')
for l in body:
key, val = l.split("=", 1)
dic[key] = val
# keep only known tags
for t in KNOWN:
if isinstance(t, basestring):
t = (t,t)
old, new = t
if old in dic.keys():
log[new] = dic[old]
# improve body readability
log['body'] = log['body'].replace("|", " ")
# Try to retrieve the date
try:
log['date'] = datetime.utcfromtimestamp(int(dic['time']))
except:
try:
log['date'] = datetime.strptime(dic['time'], "%Y-%m-%d %H:%M:%S")
except:
# cannot parse it, keep it safe
log['time'] = dic['time']
src_dst_extract(log)
int_extract(log)
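The endpoint classification in `src_dst_extract` above can be condensed into a loop over the three fields; this is a sketch reusing the callback's own IPv4 regular expression, not the framework's actual code path:

```python
import re

# IPv4 test from the LEA callback above.
IP_RE = re.compile(r"(?<![.0-9])((?:[0-9]{1,3}[.]){3}[0-9]{1,3})(?![.0-9])")

def src_dst_extract(data):
    # Tag each endpoint as an IP address or a host name, then drop the raw field.
    for field, ip_key, host_key in (('src', 'source_ip', 'source_host'),
                                    ('dst', 'dest_ip', 'dest_host'),
                                    ('orig', 'local_ip', 'local_host')):
        value = data.pop(field)
        if IP_RE.match(value):
            data[ip_key] = value
        else:
            data[host_key] = value
    return data

src_dst_extract({'src': 'naruto', 'dst': '10.0.0.1', 'orig': 'fw1'})
# -> {'source_host': 'naruto', 'dest_ip': '10.0.0.1', 'local_host': 'fw1'}
```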
lea
A list of key-value couples, separated by a | character.
L'événement est décrit à l'aide d'une série de couples clé-valeur, séparés par le caractère |.
LEAFIELDS
a list of key-value couples, separated by a | character, needing some post-processing
la liste des couples clé-valeur, à passer à une fonction de post-traitement
LEAFIELDS
decode_LEA
loc=3707|time=1199716450|action=accept|orig=fw1|i/f_dir=inbound|i/f_name=PCnet1|has_accounting=0|uuid=<47822e42,00000001,7b040a0a,000007b6>|product=VPN-1 & FireWall-1|__policy_id_tag=product=VPN-1 & FireWall-1[db_tag={9F95C344-FE3F-4E3E-ACD8-60B5194BAAB4};mgmt=fw1;date=1199701916;policy_name=Standard]|src=naruto|s_port=56840|dst=fw1|service=https|proto=tcp|rule=1
3707
accept
VPN-1 & FireWall-1
PCnet1
fw1
tcp
1
naruto
fw1
56840
https
firewall
pylogsparser-0.4/normalizers/deny_event.xml 0000644 0001750 0001750 00000062055 11705765631 017457 0 ustar fbo fbo
clo@wallix.com
[-0-9a-z]*
\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[.]\d+
[^,]*
Ugly hack for IPv6, IPv4 addresses
(?:((([0-9A-Fa-f]{1,4}:){7}(([0-9A-Fa-f]{1,4})|:))|(([0-9A-Fa-f]{1,4}:){6}(:|((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})|(:[0-9A-Fa-f]{1,4})))|(([0-9A-Fa-f]{1,4}:){5}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:){4}(:[0-9A-Fa-f]{1,4}){0,1}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:){3}(:[0-9A-Fa-f]{1,4}){0,2}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:){2}(:[0-9A-Fa-f]{1,4}){0,3}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|(([0-9A-Fa-f]{1,4}:)(:[0-9A-Fa-f]{1,4}){0,4}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|(:(:[0-9A-Fa-f]{1,4}){0,5}((:((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})?)|((:[0-9A-Fa-f]{1,4}){1,2})))|(((25[0-5]|2[0-4]\d|[01]?\d{1,2})(\.(25[0-5]|2[0-4]\d|[01]?\d{1,2})){3})))(%.+)?)?
TYPES = {
"1" : "Resources",
"2" : "System",
"3" : "Configuration",
"4" : "Security",
"5" : "Backend",
"6" : "Acceleration"
}
log['alert_type'] = TYPES.get(value, "Unknown")
SUBTYPES = {
"1.1" : "CPU",
"1.2" : "Memory",
"2.1" : "Access",
"2.2" : "Device Operations",
"3.1" : "Configuration change",
"3.2" : "Backup and Restore",
"4.1" : "HTTP Security",
"4.2" : "XML Security",
"4.3" : "Authentication",
"5.1" : "Backend availability",
"5.2" : "Backend performances",
"6.1" : "Server Load-Balancing",
"6.2" : "Caching"
}
MESSAGES = {
# Resources
"1.1.6.0" : "CPU utilization below 60%",
"1.1.4.0" : "CPU utilization over 60%",
"1.1.2.0" : "CPU utilization over 80%",
"1.2.6.0" : "Memory utilization below 70%",
"1.2.4.0" : "Memory utilization over 70%",
"1.2.2.0" : "Memory utilization over 90%",
# System
"2.1.6.0" : "User logout",
"2.1.4.0" : "User successful login",
"2.1.2.0" : "User failed login attempts",
"2.2.6.0" : "Instance started",
"2.2.5.0" : "rWeb started",
"2.2.4.0" : "Instance stopped",
"2.2.2.0" : "rWeb stopped",
# Configuration
"3.1.5.0" : "Configuration change successful",
"3.1.3.0" : "Configuration change failed",
"3.2.5.0" : "Configuration backup successful",
"3.2.5.1" : "Configuration restore successful",
"3.2.3.0" : "Configuration backup failure",
"3.2.2.0" : "Configuration restore failure",
# Security
"4.1.4.0" : "Attack blocked by Blacklist",
"4.1.4.1" : "Attack blocked by Whitelist",
"4.1.4.2" : "Attack blocked by Scoringlist",
"4.1.4.3" : "Attack blocked by UBT (DoS protection)",
"4.1.4.4" : "Attack blocked by UBT (Site Crawling)",
"4.1.4.5" : "Attack blocked by UBT (Brute Force)",
"4.1.4.6" : "Attack blocked by UBT (Cookie Theft)",
"4.1.4.7" : "Attack blocked by UBT (Direct Access)",
"4.1.4.8" : "Attack blocked by UBT (Restricted Access)",
"4.1.4.9" : "Attack blocked by Stateful engine (Link Tracking)",
"4.1.4.10" : "Attack blocked by Stateful engine (Parameter Tracking)",
"4.1.4.11" : "Attack blocked by Stateful engine (Cookie Tracking)",
"4.1.4.12" : "Attack blocked by Canonization engine (URI Wrong Encoding)",
"4.1.4.13" : "Attack blocked by Canonization engine (URI Decoding)",
"4.1.4.14" : "Attack blocked by Canonization engine (Parameter Decoding)",
"4.1.4.15" : "Attack blocked by HTTP requests filter (Forbidden Method)",
"4.1.4.16" : "Attack blocked by HTTP requests filter (Header Size)",
"4.1.4.17" : "Attack blocked by HTTP requests filter (Body Size)",
"4.1.4.18" : "Attack blocked by HTTP requests filter (Number of Request Fields)",
"4.1.4.19" : "Attack blocked by HTTP requests filter (Size of Request Fields)",
"4.1.4.20" : "Attack blocked by HTTP requests filter (Number of Request Lines)",
"4.1.4.21" : "Attack blocked by HTTP responses filter",
"4.2.4.0" : "Attack blocked by Blacklist",
"4.2.4.1" : "Attack blocked by Scoringlist",
"4.2.4.2" : "Attack blocked by XML Schema validation engine",
"4.2.4.3" : "Attack blocked by Stateful engine",
"4.2.4.4" : "Attack blocked by Canonization engine",
"4.2.4.5" : "Attack blocked by Attachment validation engine",
"4.2.4.6" : "Attack blocked by Source filtering engine",
"4.3.5.0" : "Authentication successful",
"4.3.3.1" : "Authentication failed",
# Backend
"5.1.6.0" : "Server available",
"5.1.1.0" : "Server error response",
"5.1.0.0" : "Server not response",
"5.2.6.0" : "Response time < 70% of maximum allowed",
"5.2.4.0" : "Response time > 70% of maximum allowed",
"5.2.2.0" : "Response time > 90% of maximum allowed",
# Acceleration
"6.1.6.0" : "Server back in farm",
"6.1.2.0" : "Server down, removed from farm",
"6.1.0.0" : "All servers down",
"6.2.6.0" : "Cache utilization < 70%",
"6.2.4.0" : "Cache 70% full",
"6.2.2.0" : "Cache 90% full",
"6.2.1.0" : "Cache 100% full, increase cache size",
}
event_subtype = log['alert_type_id'] + "." + log['alert_subtype_id']
log['alert_subtype'] = SUBTYPES.get(event_subtype, "Unknown")
event_id = event_subtype + "." + log['severity_code'] + "." + log['alert_id']
log['event'] = MESSAGES.get(event_id, "Unknown")
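The lookup above assembles event identifiers as `type.subtype.severity.alert_id`. A minimal sketch with a trimmed-down subset of the dictionaries (the full SUBTYPES and MESSAGES tables are defined above):

```python
# Hypothetical one-entry subsets, for illustration only.
SUBTYPES = {"4.1": "HTTP Security"}
MESSAGES = {"4.1.4.0": "Attack blocked by Blacklist"}

def decode_message(log):
    # Event ids are assembled as type.subtype.severity.alert_id
    event_subtype = log['alert_type_id'] + "." + log['alert_subtype_id']
    log['alert_subtype'] = SUBTYPES.get(event_subtype, "Unknown")
    event_id = event_subtype + "." + log['severity_code'] + "." + log['alert_id']
    log['event'] = MESSAGES.get(event_id, "Unknown")
    return log

log = decode_message({'alert_type_id': '4', 'alert_subtype_id': '1',
                      'severity_code': '4', 'alert_id': '0'})
# log['event'] -> 'Attack blocked by Blacklist'
```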
SEVERITIES=["Emerg", "Alert", "Crit", "Error", "Warn", "Notice", "Info", "Debug"]
try:
log['severity'] = SEVERITIES[int(value)]
except:
# no big deal if we don't get this one
pass
EVENT_UID,START_DATE,END_DATE,ACKDATE,ACKUSER,IP_DEVICE,IP_SOURCE,TARGET_IP,ALERT_TYPE_ID,ALERT_SUBTYPE_ID,SEVERITY,ALERT_ID,ALERT_VALUE,USER,INTERFACE,OBJECT_NAME,PARAMETER_CHANGED,PREVIOUS_VALUE,NEW_VALUE,UNKNOWN1,UNKNOWN2,UNKNOWN3,UUID_BLACKLIST,UUID_POLICY,UUID_APP,ACTION,HTTP_METHOD_USED,URL,PARAMETERS,URI,ATTACK_ID,ATTACK_USER,AUTH_MECHANISM,UNKNOWN4,UNKNOWN5,UNKNOWN6,UNKNOWN7
EVENT_UID
START_DATE
YYYY-MM-DD hh:mm:ss
END_DATE
ACKDATE
ACKUSER
IP_DEVICE
IP_SOURCE
TARGET_IP
ALERT_TYPE_ID
decode_alert_type
ALERT_SUBTYPE_ID
SEVERITY
decode_severity
ALERT_ID
ALERT_VALUE
USER
INTERFACE
OBJECT_NAME
PARAMETER_CHANGED
PREVIOUS_VALUE
NEW_VALUE
UNKNOWN1
UNKNOWN2
UNKNOWN3
UNKNOWN4
UNKNOWN5
UNKNOWN6
UNKNOWN7
UUID_BLACKLIST
UUID_POLICY
UUID_APP
ACTION
HTTP_METHOD_USED
URL
PARAMETERS
URI
ATTACK_ID
ATTACK_USER
AUTH_MECHANISM
228,2011-01-24 18:08:06.957252,2011-01-24 18:08:06.957252,,,192.168.80.10,192.168.80.1,,4,1,4,0,,,,,,,,,,,11111111-1111-1111-1111-111111111111,7ed198ca-26d5-11e0-a46f-000c298895c5,d74ca776-265b-11e0-a54a-000c298895c5,deny,GET,/cgi-bin/badstore.cgi?searchquery=1%27+OR+1%3D1+%23&action=search&x=0&y=0,GET /cgi-bin/badstore.cgi?searchquery=1' OR 1=1 #&action=search&x=0&y=0,(uri) ,11230-0 ,,,,,,
228
192.168.80.10
192.168.80.1
4
1
4
0
Security
HTTP Security
Attack blocked by Blacklist
11111111-1111-1111-1111-111111111111
7ed198ca-26d5-11e0-a46f-000c298895c5
d74ca776-265b-11e0-a54a-000c298895c5
deny
GET
/cgi-bin/badstore.cgi?searchquery=1%27+OR+1%3D1+%23&action=search&x=0&y=0
GET /cgi-bin/badstore.cgi?searchquery=1' OR 1=1 #&action=search&x=0&y=0
(uri)
11230-0
firewall
decode_message
pylogsparser-0.4/normalizers/pam.xml
This normalizer can parse messages issued by the Pluggable Authentication Module (PAM).
Ce normaliseur analyse les messages émis par le module d'authentification par greffons (PAM).
mhu@wallix.com
the name of the PAM component
le nom du composant PAM
pam_\w+
the user information
l'utilisateur concerné par l'authentification
[^ ]+
the session action
l'action de session
opened|closed
log["action"] = {'opened': 'open',
                 'closed': 'close'}.get(value, value)
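As a self-contained sketch, the mapping above behaves as follows when wrapped in the callback's two-argument form (value, log):

```python
# Sketch of the decode_action callback above: map the past-tense
# session verb ("opened"/"closed") to a canonical action name.
def decode_action(value, log):
    log["action"] = {'opened': 'open',
                     'closed': 'close'}.get(value, value)

log = {}
decode_action('opened', log)
assert log["action"] == 'open'
decode_action('reopened', log)  # unknown verbs pass through unchanged
assert log["action"] == 'reopened'
```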
This type of message is logged at session opening or closing.
Structure des messages émis à l'ouverture ou la fermeture de session.
PAMCOMPONENT\(PROGRAM:TYPE\):.* session ACTION for user USER
the PAM component
le composant PAM
PAMCOMPONENT
the program calling PAM
le programme invoquant l'authentification via PAM
PROGRAM
the authentication type
le type d'authentification
TYPE
the action taken regarding the session
l'action associée à la session
ACTION
decode_action
the user for which an authentication request is issued
l'utilisateur pour lequel la demande d'authentification est émise
USER
pam_unix(cron:session): session opened for user www-data by (uid=0)
cron
pam_unix
session
open
www-data
access control
A generic PAM message.
Structure générique des messages PAM non relatifs à une ouverture ou fermeture de session.
PAMCOMPONENT\(PROGRAM:TYPE\):.* (?:user=USER)?
the PAM component
le composant PAM
PAMCOMPONENT
the program calling PAM
le programme invoquant l'authentification via PAM
PROGRAM
the authentication type
le type d'authentification
TYPE
the user for which an authentication request is issued
l'utilisateur pour lequel la demande d'authentification est émise
USER
pylogsparser-0.4/normalizers/IIS.xml
This normalizer handles IIS 6.0 (Internet Information Service) logs, which are in w3c ELFF (Extended Log File Format).
Ce normaliseur gère les logs IIS 6.0, qui sont au format w3c ELFF (Extended Log File Format).
clo@wallix.com
Expression matching a w3c ELFF format field: any sequence of non-whitespace characters, or a hyphen (-) standing for an empty field.
Expression correspondant à un champ du format w3c ELFF, correspondant à tous les caractères 'non-espace' (ex.: espace, tabulation, saut de ligne, etc...)
[^\s]+|-
Expression matching a date in the yyyy-mm-dd hh:mm:ss format.
Expression correspondant à une date au format yyyy-mm-dd hh:mm:ss.
[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{1,2}:[0-9]{2}:[0-9]{2}
Expression matching a date in mm/dd/yy format and a time in hh:mm:ss format.
Expression correspondant à une date au format mm/dd/yy et une heure au format hh:mm:ss.
[0-9]{2}/[0-9]{2}/[0-9]{2}, [0-9]{2}:[0-9]{2}:[0-9]{2}
# IIS reports the elapsed time in milliseconds; normalize it to seconds
value = float(value)
value = value / 1000
log['time_taken'] = value
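A self-contained sketch of the conversion above, wrapped in the callback's two-argument form (value, log):

```python
# Sketch of the convert_time callback above: IIS reports the elapsed
# time in milliseconds; the normalized "time_taken" tag is in seconds.
def convert_time(value, log):
    log['time_taken'] = float(value) / 1000

log = {}
convert_time("60", log)
assert log['time_taken'] == 0.06
```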
This is the default log format in w3c ELFF format.
Log par défaut au format w3c ELFF.
DATE\s+SERVICE\s+SERVER_IP\s+REQUEST_TYPE\s+RESOURCE\s+QUERY\s+PORT\s+USERNAME\s+CLIENT_IP\s+AGENT\s+ACTION_STATUS\s+SUB_STATUS\s+WIN_STATUS
The date that the activity occurred.
La date de l'évènement.
DATE
YYYY-MM-DD hh:mm:ss
The Internet service and instance number that was accessed by a client.
Le service et le numéro de la demande du client.
SERVICE
The IP address of the server on which the log entry was generated.
IP du serveur.
SERVER_IP
The action the client was trying to perform.
Nom de la méthode. Comme GET, PASS, etc..
REQUEST_TYPE
The resource accessed.
La cible de l'opération.
RESOURCE
The query, if any, the client was trying to perform.
La requête que le client a tenté.
QUERY
The port number the client is connected to.
Le port auquel le client est connecté.
PORT
The name of the authenticated user who accessed your server. This does not include anonymous users, who are represented by a hyphen (-).
Le nom de l'utilisateur ayant accédé au serveur.
USERNAME
The IP address of the client that accessed your server.
L'adresse IP du client.
CLIENT_IP
The browser used on the client.
Le navigateur utilisé.
AGENT
The status of the action, in HTTP or FTP terms.
Le status de l'action.
ACTION_STATUS
The substatus of the action.
Le sous-status de l'action.
SUB_STATUS
The status of the action, in terms used by Microsoft Windows®.
Le status de l'action, avec les termes de Windows.
WIN_STATUS
2011-09-26 13:57:48 W3SVC1 127.0.0.1 GET /tapage.asp - 80 - 127.0.0.1 Mozilla/4.0+(compatible;MSIE+6.0;+windows+NT5.2;+SV1;+.NET+CLR+1.1.4322) 404 0 2
W3SVC1
127.0.0.1
GET
/tapage.asp
-
80
-
127.0.0.1
Mozilla/4.0+(compatible;MSIE+6.0;+windows+NT5.2;+SV1;+.NET+CLR+1.1.4322)
404
0
2
web server
This is a log format in w3c ELFF format.
Format de log au format w3c ELFF.
CLIENT_IP,\s*USERNAME,\s*DATE,\s*SERVICE_NAME,\s*SERVER_NAME,\s*SERVER_IP,\s*TIME_TAKEN,\s*CLIENT_BYTES_SENT,\s*SERVER_BYTES_SENT,\s*SERVICE_STATUS_CODE,\s*WINDOWS_STATUS_CODE,\s*REQUEST_TYPE,\s*TARGET_OF_OPERATION,\s*PARAMETERS,\s*
The IP address of the client that accessed your server.
L'adresse du client ayant accédé au serveur.
CLIENT_IP
The name of the authenticated user who accessed your server. This does not include anonymous users, who are represented by a hyphen (-).
Le nom de l'utilisateur ayant accédé au serveur.
USERNAME
The date that the activity occurred.
La date de l'évènement.
DATE
MM/DD/YY, hh:mm:ss
The Internet service and instance number that was accessed by a client.
Le service et le numéro de la demande du client.
SERVICE_NAME
Server's hostname.
Le nom du serveur.
SERVER_NAME
The IP address of the server on which the log entry was generated.
IP du serveur.
SERVER_IP
Elapsed time to complete request.
Temps écoulé pour réaliser la requête.
TIME_TAKEN
convert_time
Number of bytes sent by client. (Request size)
Taille de la requête.
CLIENT_BYTES_SENT
Number of bytes returned by the server.
Nombre d'octets retournés par le serveur.
SERVER_BYTES_SENT
Service status code. (A value of 200 indicates that the request was fulfilled successfully.)
Code de status du service.
SERVICE_STATUS_CODE
Windows status code. (A value of 0 indicates that the request was fulfilled successfully.)
code de status de Windows.
WINDOWS_STATUS_CODE
Method name. Such as GET, PASS, ...
Nom de la méthode. Comme GET, PASS, etc..
REQUEST_TYPE
The target of the operation.
La cible de l'opération.
TARGET_OF_OPERATION
The parameters that are passed to a script, if any.
Les paramètres du script s'il y en a un.
PARAMETERS
172.16.255.255, anonymous, 03/20/01, 23:58:11, MSFTPSVC, SALES1, 172.16.255.255, 60, 275, 0, 0, 0, PASS, /Intro.htm, -,
172.16.255.255
anonymous
MSFTPSVC
SALES1
172.16.255.255
275
0
0
0
PASS
/Intro.htm
-
web server
pylogsparser-0.4/normalizers/VMWare_ESX4-ESXi4.xml
This normalizer parses VMware ESX 4.x and ESXi 4.x logs that are not handled by the Syslog normalizer.
Ce normaliseur analyse les logs de VMware ESX 4.x et ESXi 4.x qui ne sont pas gérés par le normaliseur Syslog.
clo@wallix.com
Expression matching a date in the format yyyy-mm-dd hh:mm:ss.SSS (with milliseconds).
Expression correspondant à une date au format yyyy-mm-dd hh:mm:ss.
\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}
Expression matching a hexadecimal number.
Expression correspondant à un nombre hexadécimal.
[A-F0-9]{8}
Expression matching the 'alpha' field, i.e. the words between single quotes.
Expression correspondant au champ 'alpha', qui contient les mots entre '.
[^']+(?: [^']+)*
Expression matching the 'level' field.
Expression correspondant au champ 'level'.
[^\s]+
Logs contained in hostd.log file.
Logs contenus dans le fichier hostd.log.
\[DATE NUMERIC LEVEL 'ALPHA'[^\]]*\] BODY
The time at which the request was issued - please note that the timezone information is not carried over.
La date à laquelle la requête a été émise. Veuillez noter que l'information de fuseau horaire n'est pas prise en compte.
DATE
YYYY-MM-DD hh:mm:ss
NUMERIC
The level is the type of the log.
Le level correspond au type du log.
LEVEL
ALPHA
The actual event message.
Le message décrivant l'événement.
BODY
[2011-09-05 17:03:15.220 F6F74B90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 67 in the vmList
F6F74B90
verbose
App
[VpxaMoVm::CheckMoVm] did not find a VM with ID 67 in the vmList
hypervisor
[2011-09-05 17:19:21.741 F63E6900 info 'Vmomi' opID=996867CC-0000030B] Throw vmodl.fault.RequestCanceled
F63E6900
info
Vmomi
Throw vmodl.fault.RequestCanceled
hypervisor
Logs contained in sysboot.log file.
Log contenu dans le fichier sysboot.log.
sysboot: EVENT
The actual event message.
Le message décrivant l'événement.
EVENT
sysboot: Executing 'esxcfg-init --set-boot-progress done'
Executing 'esxcfg-init --set-boot-progress done'
hypervisor
pylogsparser-0.4/INSTALL
Start pylogsparser unittests
----------------------------
The command below will start the logsparser test suite.
$ NORMALIZERS_PATH=normalizers/ python tests/test_suite.py
Install pylogsparser
--------------------
# python setup.py install
pylogsparser-0.4/README.rst
LogsParser
==========
Description
:::::::::::
LogsParser is an opensource python library created by Wallix ( http://www.wallix.org ).
It is used as the core mechanism for logs tagging and normalization by Wallix's LogBox
( http://www.wallix.com/index.php/products/wallix-logbox ).
Logs come in a variety of formats. In order to parse many different types of
logs, a developer used to need to write an engine based on a long list of complex
regular expressions, which rapidly becomes unreadable and unmaintainable.
By using LogsParser, a developer can free herself from the burden of writing a
log parsing engine, since the module comes in with "batteries included".
Furthermore, this engine relies upon XML definition files that can be loaded at
runtime. The definition files were designed to be easily readable and to require
very little skill in programming or regular expressions, without sacrificing
power or expressiveness.
Purpose
:::::::
The LogsParser module uses normalization definition files in order to tag
log entries. The definition files are written in XML.
The definition files allow anyone with a basic understanding of regular
expressions and knowledge of a specific log format to create and maintain
a customized pool of parsers.
Basically a definition file will consist of a list of log patterns, each
composed of many keywords. A keyword is a placeholder for a notable and/or
variable part of the described log line, and is therefore associated with a tag
name. It is paired with a tag type, i.e. a regular expression matching the
expected value to assign to this tag. If the raw value extracted this way needs
further processing, callback functions can be applied to it.
This format also makes it possible to add useful metadata about parsed logs, such
as extensive documentation about expected log patterns and log samples.
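The keyword mechanism can be illustrated with a few lines of plain Python. This is only a sketch of the principle, not the library's actual API; the pattern, tag names and regular expressions below are hypothetical:

```python
import re

# Hypothetical pattern and tag types, for illustration only: each
# keyword expands into a named group built from its tag type's regex.
pattern = "USER logged in from IP"
tag_types = {"USER": r"\w+", "IP": r"\d{1,3}(?:\.\d{1,3}){3}"}

regex = pattern
for keyword, rx in tag_types.items():
    regex = regex.replace(keyword, "(?P<%s>%s)" % (keyword, rx))

# Matching a log line against the expanded pattern yields the tags.
match = re.match(regex, "john logged in from 192.168.0.1")
assert match.groupdict() == {"USER": "john", "IP": "192.168.0.1"}
```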
Format Description
------------------
A normalization definition file must strictly follow the specifications as
they are detailed in the file normalizer.dtd.
A simple template, called normalizer.template, is provided to help parser
writers get started with their task.
Most definition files will include the following sections:
* Some generic documentation about the parsed logs: emitting application,
application version, etc ... (non-mandatory)
* the definition file's author(s) (non-mandatory)
* custom tag types (non-mandatory)
* callback functions (non-mandatory)
* Prerequisites on tag values prior to parsing (non-mandatory)
* Log pattern(s) and how they are to be parsed
* Extra tags with a fixed value that should be added once the parsing is done
(non-mandatory)
Root
....
The definition file's root must hold the following elements :
* the normalizer's name.
* the normalizer's version.
* the flags to apply to the compilation of regular expressions associated with
this parser : unicode support, multiple lines support, and ignore case.
* how to match the regular expression : from the beginning of the log line (match)
or from anywhere in the targeted tag (search)
* the tag value to parse (raw, body...)
* the service taxonomy, if relevant, of the normalizer. See the end of this
document for more details.
Default tag types
.................
A few basic tag types are defined in the file common_tagTypes.xml. In order
to use them, this file has to be loaded when instantiating the Normalizer class;
see the class documentation for further information.
Here is a list of default tag types shipped with this library.
* Anything : any character chain of any length.
* Integer
* EpochTime : an EPOCH timestamp of arbitrary precision (to the second and below).
* syslogDate : a date as seen in syslog formatted logs (example : Mar 12 20:13:23)
* URL
* MACAddress
* Email
* IP
* ZuluTime : a "Zulu Time"-type timestamp (example : 2012-12-21T13:45:05)
Custom Tag Types
................
It is always possible to define new tag types in a parser definition file, and
to overwrite default ones. To define a new tag type, the following elements are
needed :
* a type name. This will be used as the type reference in log patterns.
* the python type of the expected result : this element is not used yet and can
be safely set to anything.
* a non-mandatory description.
* the regular expression defining this type.
Callback Functions
..................
One might want to transform a raw value after it has been extracted from a pattern:
the syslog normalizer converts the raw log timestamp into a python datetime object,
for example.
In order to do this, the "callback" tag must be used to define a callback function.
It requires a function name as a mandatory attribute. Its text defines the
function body as in python, meaning the PEP8 indentation rules are to be followed.
When writing a callback function, the following rules must be respected :
* Your callback function will take ONLY two arguments: **value** and **log**.
"value" is the raw value extracted from applying the log pattern to the log,
and "log" is the dictionary of the normalized log in its current state (prior
to normalization induced by this parser definition file).
* Your callback function can modify the "log" argument (especially assign
the transformed value to the concerned tag name) and must not return anything.
* Your callback function has restricted access to the following facilities: ::
"list", "dict", "tuple", "set", "long", "float", "object",
"bool", "callable", "True", "False", "dir",
"frozenset", "getattr", "hasattr", "abs", "cmp", "complex",
"divmod", "id", "pow", "round", "slice", "vars",
"hash", "hex", "int", "isinstance", "issubclass", "len",
"map", "filter", "max", "min", "oct", "chr", "ord", "range",
"reduce", "repr", "str", "unicode", "basestring", "type", "zip", "xrange", "None",
"Exception"
* Importing modules is therefore forbidden and impossible. The *re* and *datetime*
modules are available for use as if the following lines were present: ::
import re
from datetime import datetime
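A minimal callback obeying these rules might look as follows. This is a sketch, not a callback shipped with the library; the date format handled here is hypothetical:

```python
from datetime import datetime

# Sketch of a callback: exactly two arguments, mutate the "log"
# dictionary in place, and return nothing.
def decode_date(value, log):
    # value is the raw string captured by the pattern,
    # e.g. "2012-12-21 13:45:05" (hypothetical format)
    log["date"] = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")

log = {"raw": "..."}
decode_date("2012-12-21 13:45:05", log)
assert log["date"] == datetime(2012, 12, 21, 13, 45, 5)
```

In an actual definition file only the function body appears; the two-argument signature is implied by the engine, as described above.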
* In version 0.4, the "extras" package is introduced. It allows more freedom in
what can be used in callbacks. It also increases execution speed in some
cases; typically when you need to use complex objects in your callback like
a big set or a big regular expression. In the old approach, this object
would be created each time the function is called; by deporting the object's
creation in the extras package it is created once and for all. See the modules
in logsparser.extras for use cases.
Default callbacks
.................
As with default tag types, a few generic callbacks are defined in the file
common_callBacks.xml. Currently they are meant to deal with common date
formats. Therefore they will automatically set the "date" tag. In order to
use them, the callbacks file has to be loaded when instantiating the Normalizer
class; see the class documentation for further information.
In case of name collisions, callbacks defined in a normalizer description file
take precedence over common callbacks.
Here is a list of default callbacks shipped with this library.
* MM/dd/YYYY hh:mm:ss : parses dates such as 04/13/2010 14:23:56
* dd/MMM/YYYY:hh:mm:ss : parses dates such as 19/Jul/2009 12:02:43
* MMM dd hh:mm:ss : parses dates such as Oct 23 10:23:12 . The year is guessed
so that the resulting date is the closest in the past.
* DDD MMM dd hh:mm:ss YYYY : parses dates such as Mon Sep 11 09:13:54 2011
* YYYY-MM-DD hh:mm:ss : parses dates such as 2012-12-21 00:00:00
* MM/DD/YY, hh:mm:ss : parses dates such as 10/23/11, 07:24:04 . The year is
assumed to be in the XXIst century.
* YYMMDD hh:mm:ss: parses dates such as 070811 17:23:12 . The year is assumed
to be in the XXIst century.
* ISO8601 : converts a combined date and time in UTC expressed according to the
ISO 8601 standard. Also commonly referred to as "Zulu Time".
* EPOCH : parses EPOCH timestamps
* dd-MMM-YYYY hh:mm:ss : parses dates such as 28-Feb-2010 23:15:54
Final callbacks
...............
One might want to wait until a pattern has been fully applied before processing
data: if, for example, you'd like to tag a log with a value made of a
concatenation of other values. It is possible to specify a list of callbacks to
apply at the end of the parsing with the XML tag "finalCallbacks".
Such callbacks will follow the mechanics described above, with one notable change:
they will be called with the argument "value" set to None. Therefore, you have
to make sure your callback will work correctly that way.
There are a few examples of use available: in the test_normalizer.py test code,
and in the deny_all normalizer.
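As an illustration, a final callback could look like the following sketch (the tag names and the callback name are hypothetical). Because it runs once the pattern has been fully applied, it receives value=None and works from the log dictionary alone:

```python
# Hypothetical final callback: concatenate two already-extracted tags.
# It is called with value=None once the whole pattern has been applied.
def build_session_id(value, log):
    log["session_id"] = "%s-%s" % (log["source_ip"], log["source_port"])

log = {"source_ip": "10.0.0.1", "source_port": "514"}
build_session_id(None, log)
assert log["session_id"] == "10.0.0.1-514"
```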
Pattern definition
..................
A definition file can contain as many log patterns as one sees fit. These patterns
are simplified regular expressions and applied in alphabetical order of their names,
so it is important to name them so that the more precise patterns are tried
before the more generic ones.
A pattern is a "meta regular expression", which means that every syntactic rule of
python's regular expressions is to be followed when writing a pattern, especially
escaping special characters. To make the patterns easier to read than an obtuse
regular expression, keywords act as "macros" and correspond to a part of the log
to assign to a tag.
A log pattern has the following components:
* A name.
* A non-mandatory description of the pattern's context.
* The pattern itself, under the tag "text".
* The tags as they appear in the pattern, the associated name once the normalization
is over, and the callback functions, if any, to apply to their raw values.
* Non-mandatory log samples. These can be used for self-validation.
If a tag name starts with __ (double underscore), this tag won't be added to the
final normalized dictionary. This allows to create temporary tags that will
typically be used in conjunction to a series of callback functions, when the
original raw value has no actual interest.
To define log patterns describing a CSV-formatted message, one must add the
following attributes in the tag "text":
* type="csv"
* separator="," or the relevant separator character
* quotechar='"' or the relevant quotation character
Tags are then defined normally. Pylogsparser will deal automatically with missing
fields.
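The CSV handling described above can be approximated with Python's standard csv module. This is a sketch of the principle, not the library's implementation; the field names and sample record are hypothetical:

```python
import csv

# The separator and quotechar attributes map naturally onto the csv
# module's delimiter and quotechar parameters.
fields = ["user", "source_ip", "status"]   # hypothetical tag names
line = 'john,"10.0.0.1"'                   # trailing "status" field missing

values = next(csv.reader([line], delimiter=',', quotechar='"'))
tags = dict(zip(fields, values))           # zip() drops the missing field
assert tags == {"user": "john", "source_ip": "10.0.0.1"}
```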
Best practices
..............
* Order your patterns in decreasing order of specificity. Not doing so might
trigger errors, as more generic patterns will match earlier.
* The more precise your tagTypes' regular expressions, the more accurate your
parser will be.
* Use description tags liberally. The more documented a log format, the better.
Examples are also invaluable.
Tag naming convention
.....................
The tag naming convention is lowercase, underscore separated words. It is strongly
recommended to stick to that naming convention when writing new normalizers
for consistency's sake. In case of dynamic fields, it is advised to make sure
dynamic naming follows the convention. There's an example of this in
MSExchange2007MessageTracking.xml; see the callback named "decode_MTLSourceContext".
Logs contain common information such as usernames, IP addresses, and details
about the transport protocol. In order to ease log post-processing we must define
a common method to name those tags, and not deal for example with a series of
"login, user, username, userid" all describing a user id.
The alphabetical list below is a series of tag names that must be used when relevant.
- action : action taken by a component such as DELETED, migrated, DROP, open.
- bind_int : binding interface for a network service.
- dest_host : hostname or FQDN of a destination host.
- dest_ip : IP address of a destination host.
- dest_mac : MAC address of a destination host.
- dest_port : destination port of a network connection.
- event_id : id describing an event.
- inbound_int : network interface for incoming data.
- len : a data size.
- local_host : hostname or FQDN of the local host.
- local_ip : IP address of the local host.
- local_mac : MAC address of the local host.
- local_port : listening port of a local service.
- message_id : message or transaction id.
- message_recipient : message recipient id.
- message_sender : message sender id.
- method : component access method such as GET, key_auth.
- outbound_int : network interface for outgoing data.
- protocol : network or software protocol name or numeric id such as TCP, NTP, SMTP.
- source_host : hostname or FQDN of a source host.
- source_ip : IP address of a source host.
- source_mac : MAC address of a source host.
- source_port : source port of a network connection.
- status : component status such as FAIL, success, 404.
see below for a complete list.
- url : a URL as defined in rfc1738. (scheme://netloc/path;parameters?query#fragment)
- user : a user id.
Service taxonomy
................
As of pylogsparser 0.4 a taxonomy tag is added to relevant normalizers. It helps
classify logs by service type, which can be useful for reporting among other
things. Here is a list of identified services; suggestions and improvements are
welcome!
+-----------+----------------------------------------+------------------------+
| Service | Description | Normalizers |
+===========+========================================+========================+
| access | A service dealing with authentication | Fail2ban |
| control | and/or authorization | pam |
| | | sshd |
| | | wabauth |
+-----------+----------------------------------------+------------------------+
| antivirus | A service dealing with malware | bitdefender |
| | detection and prevention | symantec |
+-----------+----------------------------------------+------------------------+
| database | A database service such as mySQLd, | mysql |
| | postmaster (postGRESQL), ... | |
+-----------+----------------------------------------+------------------------+
| address | A service in charge of network address | dhcpd |
|assignation| assignations | |
+-----------+----------------------------------------+------------------------+
| name | A service in charge of network names | named |
| resolution| resolutions | named-2 |
+-----------+----------------------------------------+------------------------+
| firewall | A service in charge of monitoring | LEA |
| | and filtering network traffic | arkoonFAST360 |
| | | deny_event |
| | | netfilter |
+-----------+----------------------------------------+------------------------+
| file | A file transfer service | xferlog |
| transfer | | |
+-----------+----------------------------------------+------------------------+
| hypervisor| A virtualization platform service | VMWare_ESX4-ESXi4 |
| | | |
+-----------+----------------------------------------+------------------------+
| mail | A mail server | MSExchange2007- |
| | | MessageTracking |
| | | postfix |
+-----------+----------------------------------------+------------------------+
| web proxy | A service acting as an intermediary | dansguardian |
| | between clients and web resources; | deny_traffic |
| | access control and content filtering | squid |
| | can also occur | |
+-----------+----------------------------------------+------------------------+
| web server| A service exposing web resources | IIS |
| | | apache |
+-----------+----------------------------------------+------------------------+
pylogsparser-0.4/LICENSE
GNU LESSER GENERAL PUBLIC LICENSE
Version 2.1, February 1999
Copyright (C) 1991, 1999 Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
[This is the first released version of the Lesser GPL. It also counts
as the successor of the GNU Library Public License, version 2, hence
the version number 2.1.]
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
Licenses are intended to guarantee your freedom to share and change
free software--to make sure the software is free for all its users.
This license, the Lesser General Public License, applies to some
specially designated software packages--typically libraries--of the
Free Software Foundation and other authors who decide to use it. You
can use it too, but we suggest you first think carefully about whether
this license or the ordinary General Public License is the better
strategy to use in any particular case, based on the explanations below.
When we speak of free software, we are referring to freedom of use,
not price. Our General Public Licenses are designed to make sure that
you have the freedom to distribute copies of free software (and charge
for this service if you wish); that you receive source code or can get
it if you want it; that you can change the software and use pieces of
it in new free programs; and that you are informed that you can do
these things.
To protect your rights, we need to make restrictions that forbid
distributors to deny you these rights or to ask you to surrender these
rights. These restrictions translate to certain responsibilities for
you if you distribute copies of the library or if you modify it.
For example, if you distribute copies of the library, whether gratis
or for a fee, you must give the recipients all the rights that we gave
you. You must make sure that they, too, receive or can get the source
code. If you link other code with the library, you must provide
complete object files to the recipients, so that they can relink them
with the library after making changes to the library and recompiling
it. And you must show them these terms so they know their rights.
We protect your rights with a two-step method: (1) we copyright the
library, and (2) we offer you this license, which gives you legal
permission to copy, distribute and/or modify the library.
To protect each distributor, we want to make it very clear that
there is no warranty for the free library. Also, if the library is
modified by someone else and passed on, the recipients should know
that what they have is not the original version, so that the original
author's reputation will not be affected by problems that might be
introduced by others.
Finally, software patents pose a constant threat to the existence of
any free program. We wish to make sure that a company cannot
effectively restrict the users of a free program by obtaining a
restrictive license from a patent holder. Therefore, we insist that
any patent license obtained for a version of the library must be
consistent with the full freedom of use specified in this license.
Most GNU software, including some libraries, is covered by the
ordinary GNU General Public License. This license, the GNU Lesser
General Public License, applies to certain designated libraries, and
is quite different from the ordinary General Public License. We use
this license for certain libraries in order to permit linking those
libraries into non-free programs.
When a program is linked with a library, whether statically or using
a shared library, the combination of the two is legally speaking a
combined work, a derivative of the original library. The ordinary
General Public License therefore permits such linking only if the
entire combination fits its criteria of freedom. The Lesser General
Public License permits more lax criteria for linking other code with
the library.
We call this license the "Lesser" General Public License because it
does Less to protect the user's freedom than the ordinary General
Public License. It also provides other free software developers Less
of an advantage over competing non-free programs. These disadvantages
are the reason we use the ordinary General Public License for many
libraries. However, the Lesser license provides advantages in certain
special circumstances.
For example, on rare occasions, there may be a special need to
encourage the widest possible use of a certain library, so that it becomes
a de-facto standard. To achieve this, non-free programs must be
allowed to use the library. A more frequent case is that a free
library does the same job as widely used non-free libraries. In this
case, there is little to gain by limiting the free library to free
software only, so we use the Lesser General Public License.
In other cases, permission to use a particular library in non-free
programs enables a greater number of people to use a large body of
free software. For example, permission to use the GNU C Library in
non-free programs enables many more people to use the whole GNU
operating system, as well as its variant, the GNU/Linux operating
system.
Although the Lesser General Public License is Less protective of the
users' freedom, it does ensure that the user of a program that is
linked with the Library has the freedom and the wherewithal to run
that program using a modified version of the Library.
The precise terms and conditions for copying, distribution and
modification follow. Pay close attention to the difference between a
"work based on the library" and a "work that uses the library". The
former contains code derived from the library, whereas the latter must
be combined with the library in order to run.
GNU LESSER GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License Agreement applies to any software library or other
program which contains a notice placed by the copyright holder or
other authorized party saying it may be distributed under the terms of
this Lesser General Public License (also called "this License").
Each licensee is addressed as "you".
A "library" means a collection of software functions and/or data
prepared so as to be conveniently linked with application programs
(which use some of those functions and data) to form executables.
The "Library", below, refers to any such software library or work
which has been distributed under these terms. A "work based on the
Library" means either the Library or any derivative work under
copyright law: that is to say, a work containing the Library or a
portion of it, either verbatim or with modifications and/or translated
straightforwardly into another language. (Hereinafter, translation is
included without limitation in the term "modification".)
"Source code" for a work means the preferred form of the work for
making modifications to it. For a library, complete source code means
all the source code for all modules it contains, plus any associated
interface definition files, plus the scripts used to control compilation
and installation of the library.
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running a program using the Library is not restricted, and output from
such a program is covered only if its contents constitute a work based
on the Library (independent of the use of the Library in a tool for
writing it). Whether that is true depends on what the Library does
and what the program that uses the Library does.
1. You may copy and distribute verbatim copies of the Library's
complete source code as you receive it, in any medium, provided that
you conspicuously and appropriately publish on each copy an
appropriate copyright notice and disclaimer of warranty; keep intact
all the notices that refer to this License and to the absence of any
warranty; and distribute a copy of this License along with the
Library.
You may charge a fee for the physical act of transferring a copy,
and you may at your option offer warranty protection in exchange for a
fee.
2. You may modify your copy or copies of the Library or any portion
of it, thus forming a work based on the Library, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) The modified work must itself be a software library.
b) You must cause the files modified to carry prominent notices
stating that you changed the files and the date of any change.
c) You must cause the whole of the work to be licensed at no
charge to all third parties under the terms of this License.
d) If a facility in the modified Library refers to a function or a
table of data to be supplied by an application program that uses
the facility, other than as an argument passed when the facility
is invoked, then you must make a good faith effort to ensure that,
in the event an application does not supply such function or
table, the facility still operates, and performs whatever part of
its purpose remains meaningful.
(For example, a function in a library to compute square roots has
a purpose that is entirely well-defined independent of the
application. Therefore, Subsection 2d requires that any
application-supplied function or table used by this function must
be optional: if the application does not supply it, the square
root function must still compute square roots.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Library,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Library, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote
it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Library.
In addition, mere aggregation of another work not based on the Library
with the Library (or with a work based on the Library) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library. To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License. (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.) Do not make any other change in
these notices.
Once this change is made in a given copy, it is irreversible for
that copy, so the ordinary GNU General Public License applies to all
subsequent copies and derivative works made from that copy.
This option is useful when you wish to copy part of the code of
the Library into a program that is not a library.
4. You may copy and distribute the Library (or a portion or
derivative of it, under Section 2) in object code or executable form
under the terms of Sections 1 and 2 above provided that you accompany
it with the complete corresponding machine-readable source code, which
must be distributed under the terms of Sections 1 and 2 above on a
medium customarily used for software interchange.
If distribution of object code is made by offering access to copy
from a designated place, then offering equivalent access to copy the
source code from the same place satisfies the requirement to
distribute the source code, even though third parties are not
compelled to copy the source along with the object code.
5. A program that contains no derivative of any portion of the
Library, but is designed to work with the Library by being compiled or
linked with it, is called a "work that uses the Library". Such a
work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License.
However, linking a "work that uses the Library" with the Library
creates an executable that is a derivative of the Library (because it
contains portions of the Library), rather than a "work that uses the
library". The executable is therefore covered by this License.
Section 6 states terms for distribution of such executables.
When a "work that uses the Library" uses material from a header file
that is part of the Library, the object code for the work may be a
derivative work of the Library even though the source code is not.
Whether this is true is especially significant if the work can be
linked without the Library, or if the work is itself a library. The
threshold for this to be true is not precisely defined by law.
If such an object file uses only numerical parameters, data
structure layouts and accessors, and small macros and small inline
functions (ten lines or less in length), then the use of the object
file is unrestricted, regardless of whether it is legally a derivative
work. (Executables containing this object code plus portions of the
Library will still fall under Section 6.)
Otherwise, if the work is a derivative of the Library, you may
distribute the object code for the work under the terms of Section 6.
Any executables containing that work also fall under Section 6,
whether or not they are linked directly with the Library itself.
6. As an exception to the Sections above, you may also combine or
link a "work that uses the Library" with the Library to produce a
work containing portions of the Library, and distribute that work
under terms of your choice, provided that the terms permit
modification of the work for the customer's own use and reverse
engineering for debugging such modifications.
You must give prominent notice with each copy of the work that the
Library is used in it and that the Library and its use are covered by
this License. You must supply a copy of this License. If the work
during execution displays copyright notices, you must include the
copyright notice for the Library among them, as well as a reference
directing the user to the copy of this License. Also, you must do one
of these things:
a) Accompany the work with the complete corresponding
machine-readable source code for the Library including whatever
changes were used in the work (which must be distributed under
Sections 1 and 2 above); and, if the work is an executable linked
with the Library, with the complete machine-readable "work that
uses the Library", as object code and/or source code, so that the
user can modify the Library and then relink to produce a modified
executable containing the modified Library. (It is understood
that the user who changes the contents of definitions files in the
Library will not necessarily be able to recompile the application
to use the modified definitions.)
b) Use a suitable shared library mechanism for linking with the
Library. A suitable mechanism is one that (1) uses at run time a
copy of the library already present on the user's computer system,
rather than copying library functions into the executable, and (2)
will operate properly with a modified version of the library, if
the user installs one, as long as the modified version is
interface-compatible with the version that the work was made with.
c) Accompany the work with a written offer, valid for at
least three years, to give the same user the materials
specified in Subsection 6a, above, for a charge no more
than the cost of performing this distribution.
d) If distribution of the work is made by offering access to copy
from a designated place, offer equivalent access to copy the above
specified materials from the same place.
e) Verify that the user has already received a copy of these
materials or that you have already sent this user a copy.
For an executable, the required form of the "work that uses the
Library" must include any data and utility programs needed for
reproducing the executable from it. However, as a special exception,
the materials to be distributed need not include anything that is
normally distributed (in either source or binary form) with the major
components (compiler, kernel, and so on) of the operating system on
which the executable runs, unless that component itself accompanies
the executable.
It may happen that this requirement contradicts the license
restrictions of other proprietary libraries that do not normally
accompany the operating system. Such a contradiction means you cannot
use both them and the Library together in an executable that you
distribute.
7. You may place library facilities that are a work based on the
Library side-by-side in a single library together with other library
facilities not covered by this License, and distribute such a combined
library, provided that the separate distribution of the work based on
the Library and of the other library facilities is otherwise
permitted, and provided that you do these two things:
a) Accompany the combined library with a copy of the same work
based on the Library, uncombined with any other library
facilities. This must be distributed under the terms of the
Sections above.
b) Give prominent notice with the combined library of the fact
that part of it is a work based on the Library, and explaining
where to find the accompanying uncombined form of the same work.
8. You may not copy, modify, sublicense, link with, or distribute
the Library except as expressly provided under this License. Any
attempt otherwise to copy, modify, sublicense, link with, or
distribute the Library is void, and will automatically terminate your
rights under this License. However, parties who have received copies,
or rights, from you under this License will not have their licenses
terminated so long as such parties remain in full compliance.
9. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Library or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Library (or any work based on the
Library), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Library or works based on it.
10. Each time you redistribute the Library (or any work based on the
Library), the recipient automatically receives a license from the
original licensor to copy, distribute, link with or modify the Library
subject to these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties with
this License.
11. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Library at all. For example, if a patent
license would not permit royalty-free redistribution of the Library by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Library.
If any portion of this section is held invalid or unenforceable under any
particular circumstance, the balance of the section is intended to apply,
and the section as a whole is intended to apply in other circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
12. If the distribution and/or use of the Library is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Library under this License may add
an explicit geographical distribution limitation excluding those countries,
so that distribution is permitted only in or among countries not thus
excluded. In such case, this License incorporates the limitation as if
written in the body of this License.
13. The Free Software Foundation may publish revised and/or new
versions of the Lesser General Public License from time to time.
Such new versions will be similar in spirit to the present version,
but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Library
specifies a version number of this License which applies to it and
"any later version", you have the option of following the terms and
conditions either of that version or of any later version published by
the Free Software Foundation. If the Library does not specify a
license version number, you may choose any version ever published by
the Free Software Foundation.
14. If you wish to incorporate parts of the Library into other free
programs whose distribution conditions are incompatible with these,
write to the author to ask for permission. For software which is
copyrighted by the Free Software Foundation, write to the Free
Software Foundation; we sometimes make exceptions for this. Our
decision will be guided by the two goals of preserving the free status
of all derivatives of our free software and of promoting the sharing
and reuse of software generally.
NO WARRANTY
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Libraries
If you develop a new library, and you want it to be of the greatest
possible use to the public, we recommend making it free software that
everyone can redistribute and change. You can do so by permitting
redistribution under these terms (or, alternatively, under the terms of the
ordinary General Public License).
To apply these terms, attach the following notices to the library. It is
safest to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least the
"copyright" line and a pointer to where the full notice is found.
<one line to give the library's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Also add information on how to contact you by electronic and paper mail.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the library, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the
library `Frob' (a library for tweaking knobs) written by James Random Hacker.
<signature of Ty Coon>, 1 April 1990
Ty Coon, President of Vice
That's all there is to it!
pylogsparser-0.4/tests/test_norm_chain_speed.py
# -*- python -*-
# pylogsparser - Logs parsers python library
#
# Copyright (C) 2011 Wallix Inc.
#
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by the
# Free Software Foundation; either version 2.1 of the License, or (at your
# option) any later version.
#
# This library is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
# details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this library; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
import os
import timeit
from logsparser.lognormalizer import LogNormalizer
if __name__ == "__main__":
path = os.environ['NORMALIZERS_PATH']
ln = LogNormalizer(path)
def test():
l = {'raw' : "<29>Jul 18 08:55:35 naruto squid[3245]: 1259844091.407 307 82.238.42.70 TCP_MISS/200 1015 GET http://www.ietf.org/css/ietf.css fbo DIRECT/64.170.98.32 text/css" }
l = ln.uuidify(l)
ln.normalize(l)
print "Testing speed ..."
t = timeit.Timer("test()", "from __main__ import test")
speed = t.timeit(100000)/100000
print "%.2f microseconds per pass, giving a theoretical speed of %i logs/s." % (speed * 1000000, 1 / speed)
print "Testing speed with minimal normalization ..."
ln.set_active_normalizers({'syslog' : True})
ln.reload()
t = timeit.Timer("test()", "from __main__ import test")
speed = t.timeit(100000)/100000
print "%.2f microseconds per pass, giving a theoretical speed of %i logs/s." % (speed * 1000000, 1 / speed)
pylogsparser-0.4/tests/test_normalizer.py
# -*- python -*-
# pylogsparser - Logs parsers python library
#
# Copyright (C) 2011 Wallix Inc.
#
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by the
# Free Software Foundation; either version 2.1 of the License, or (at your
# option) any later version.
#
# This library is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
# details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this library; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
import os
import unittest
from datetime import datetime
from logsparser.normalizer import Normalizer, TagType, Tag, CallbackFunction, CSVPattern, get_generic_tagTypes
from lxml.etree import parse, DTD
from StringIO import StringIO
class TestSample(unittest.TestCase):
"""Unit tests for logsparser.normalize. Validate sample log example"""
normalizer_path = os.environ['NORMALIZERS_PATH']
def normalize_samples(self, norm, name, version):
"""Test logparser.normalize validate for syslog normalizer."""
# open parser
n = parse(open(os.path.join(self.normalizer_path, norm)))
# validate DTD
dtd = DTD(open(os.path.join(self.normalizer_path,
'normalizer.dtd')))
dtd.assertValid(n)
# Create normalizer from xml definition
normalizer = Normalizer(n, os.path.join(self.normalizer_path, 'common_tagTypes.xml'), os.path.join(self.normalizer_path, 'common_callBacks.xml'))
self.assertEqual(normalizer.name, name)
self.assertEqual(normalizer.version, version)
self.assertTrue(normalizer.validate())
def test_normalize_samples_001_syslog(self):
self.normalize_samples('syslog.xml', 'syslog', 0.99)
def test_normalize_samples_002_apache(self):
self.normalize_samples('apache.xml', 'apache', 0.99)
def test_normalize_samples_003_dhcpd(self):
self.normalize_samples('dhcpd.xml', 'DHCPd', 0.99)
def test_normalize_samples_004_lea(self):
self.normalize_samples('LEA.xml', 'LEA', 0.99)
def test_normalize_samples_005_netfilter(self):
self.normalize_samples('netfilter.xml', 'netfilter', 0.99)
def test_normalize_samples_006_pam(self):
self.normalize_samples('pam.xml', 'PAM', 0.99)
def test_normalize_samples_007_postfix(self):
self.normalize_samples('postfix.xml', 'postfix', 0.99)
def test_normalize_samples_008_squid(self):
self.normalize_samples('squid.xml', 'squid', 0.99)
def test_normalize_samples_009_sshd(self):
self.normalize_samples('sshd.xml', 'sshd', 0.99)
def test_normalize_samples_010_named(self):
self.normalize_samples('named.xml', 'named', 0.99)
def test_normalize_samples_011_named2(self):
self.normalize_samples('named-2.xml', 'named-2', 0.99)
def test_normalize_samples_012_symantec(self):
self.normalize_samples('symantec.xml', 'symantec', 0.99)
def test_normalize_samples_013_msexchange2007MTL(self):
self.normalize_samples('MSExchange2007MessageTracking.xml', 'MSExchange2007MessageTracking', 0.99)
def test_normalize_samples_014_arkoonfast360(self):
self.normalize_samples('arkoonFAST360.xml', 'arkoonFAST360', 0.99)
def test_normalize_samples_015_s3(self):
self.normalize_samples('s3.xml', 's3', 0.99)
def test_normalize_samples_016_snare(self):
self.normalize_samples('snare.xml', 'snare', 0.99)
def test_normalize_samples_017_vmware(self):
self.normalize_samples('VMWare_ESX4-ESXi4.xml', 'VMWare_ESX4-ESXi4', 0.99)
# def test_normalize_samples_018_mysql(self):
# self.normalize_samples('mysql.xml', 'mysql', 0.99)
def test_normalize_samples_019_IIS(self):
self.normalize_samples('IIS.xml', 'IIS', 0.99)
def test_normalize_samples_020_fail2ban(self):
self.normalize_samples('Fail2ban.xml', 'Fail2ban', 0.99)
def test_normalize_samples_021_GeoIPsource(self):
try:
import GeoIP #pyflakes:ignore
self.normalize_samples('GeoIPsource.xml', 'GeoIPsource', 0.99)
except ImportError:
# cannot test
pass
def test_normalize_samples_022_URL_parsers(self):
self.normalize_samples('URLparser.xml', 'URLparser', 0.99)
self.normalize_samples('RefererParser.xml', 'RefererParser', 0.99)
def test_normalize_samples_023_bitdefender(self):
self.normalize_samples('bitdefender.xml', 'bitdefender', 0.99)
def test_normalize_samples_024_denyall_traffic(self):
self.normalize_samples('deny_traffic.xml', 'deny_traffic', 0.99)
def test_normalize_samples_025_denyall_event(self):
self.normalize_samples('deny_event.xml', 'deny_event', 0.99)
def test_normalize_samples_026_xferlog(self):
self.normalize_samples('xferlog.xml', 'xferlog', 0.99)
def test_normalize_samples_027_wabauth(self):
self.normalize_samples('wabauth.xml', 'wabauth', 0.99)
def test_normalize_samples_028_dansguardian(self):
self.normalize_samples('dansguardian.xml', 'dansguardian', 0.99)
def test_normalize_samples_029_cisco_asa_header(self):
self.normalize_samples('cisco-asa_header.xml', 'cisco-asa_header', 0.99)
def test_normalize_samples_030_cisco_asa_msg(self):
self.normalize_samples('cisco-asa_msg.xml', 'cisco-asa_msg', 0.99)
class TestCSVPattern(unittest.TestCase):
"""Test CSVPattern behaviour"""
normalizer_path = os.environ['NORMALIZERS_PATH']
tt1 = TagType(name='Anything', ttype=str, regexp='.*')
tt2 = TagType(name='SyslogDate', ttype=datetime,
regexp='[A-Z][a-z]{2} [ 0-9]\d \d{2}:\d{2}:\d{2}')
tag_types = {}
for tt in (tt1, tt2):
tag_types[tt.name] = tt
generic_tagTypes = get_generic_tagTypes(path = os.path.join(normalizer_path,
'common_tagTypes.xml'))
cb_syslogdate = CallbackFunction("""
MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
now = datetime.now()
currentyear = now.year
# The following call may raise ValueError in several ways
newdate = datetime(currentyear,
MONTHS.index(value[0:3]) + 1,
int(value[4:6]),
int(value[7:9]),
int(value[10:12]),
int(value[13:15]))
log["date"] = newdate
""", name = 'formatsyslogdate')
def test_normalize_csv_pattern_001(self):
t1 = Tag(name='date',
tagtype = 'Anything',
substitute = 'DATE')
t2 = Tag(name='id',
tagtype = 'Anything',
substitute = 'ID')
t3 = Tag(name='msg',
tagtype = 'Anything',
substitute = 'MSG')
p_tags = {}
for t in (t1, t2, t3):
p_tags[t.name] = t
p = CSVPattern('test', 'DATE,ID,MSG', tags = p_tags, tagTypes = self.tag_types, genericTagTypes = self.generic_tagTypes)
ret = p.normalize('Jul 18 08:55:35,83,"start listening on 127.0.0.1, pam auth started"')
self.assertEqual(ret['date'], 'Jul 18 08:55:35')
self.assertEqual(ret['id'], '83')
self.assertEqual(ret['msg'], 'start listening on 127.0.0.1, pam auth started')
def test_normalize_csv_pattern_002(self):
t1 = Tag(name='date',
tagtype = 'SyslogDate',
substitute = 'DATE')
t2 = Tag(name='id',
tagtype = 'Anything',
substitute = 'ID')
t3 = Tag(name='msg',
tagtype = 'Anything',
substitute = 'MSG')
p_tags = {}
for t in (t1, t2, t3):
p_tags[t.name] = t
p = CSVPattern('test', 'DATE,ID,MSG', tags = p_tags, tagTypes = self.tag_types, genericTagTypes = self.generic_tagTypes)
ret = p.normalize('Jul 18 08:55:35,83,"start listening on 127.0.0.1, pam auth started"')
self.assertEqual(ret['date'], 'Jul 18 08:55:35')
self.assertEqual(ret['id'], '83')
self.assertEqual(ret['msg'], 'start listening on 127.0.0.1, pam auth started')
ret = p.normalize('2011 Jul 18 08:55:35,83,"start listening on 127.0.0.1, pam auth started"')
self.assertEqual(ret, None)
def test_normalize_csv_pattern_003(self):
t1 = Tag(name='date',
tagtype = 'SyslogDate',
substitute = 'DATE',
callbacks = ['formatsyslogdate'])
t2 = Tag(name='id',
tagtype = 'Anything',
substitute = 'ID')
t3 = Tag(name='msg',
tagtype = 'Anything',
substitute = 'MSG')
p_tags = {}
for t in (t1, t2, t3):
p_tags[t.name] = t
p = CSVPattern('test', 'DATE,ID,MSG', tags = p_tags,
tagTypes = self.tag_types, callBacks = {self.cb_syslogdate.name:self.cb_syslogdate},
genericTagTypes = self.generic_tagTypes)
ret = p.normalize('Jul 18 08:55:35,83,"start listening on 127.0.0.1, pam auth started"')
self.assertEqual(ret['date'], datetime(datetime.now().year, 7, 18, 8, 55, 35))
self.assertEqual(ret['id'], '83')
self.assertEqual(ret['msg'], 'start listening on 127.0.0.1, pam auth started')
def test_normalize_csv_pattern_004(self):
t1 = Tag(name='date',
tagtype = 'Anything',
substitute = 'DATE')
t2 = Tag(name='id',
tagtype = 'Anything',
substitute = 'ID')
t3 = Tag(name='msg',
tagtype = 'Anything',
substitute = 'MSG')
p_tags = {}
for t in (t1, t2, t3):
p_tags[t.name] = t
p = CSVPattern('test', ' DATE; ID ;MSG ', separator = ';', quotechar = '=', tags = p_tags, tagTypes = self.tag_types, genericTagTypes = self.generic_tagTypes)
ret = p.normalize('Jul 18 08:55:35;83;=start listening on 127.0.0.1; pam auth started=')
self.assertEqual(ret['date'], 'Jul 18 08:55:35')
self.assertEqual(ret['id'], '83')
self.assertEqual(ret['msg'], 'start listening on 127.0.0.1; pam auth started')
def test_normalize_csv_pattern_005(self):
t1 = Tag(name='date',
tagtype = 'Anything',
substitute = 'DATE')
t2 = Tag(name='id',
tagtype = 'Anything',
substitute = 'ID')
t3 = Tag(name='msg',
tagtype = 'Anything',
substitute = 'MSG')
p_tags = {}
for t in (t1, t2, t3):
p_tags[t.name] = t
p = CSVPattern('test', 'DATE ID MSG', separator = ' ', quotechar = '=', tags = p_tags, tagTypes = self.tag_types, genericTagTypes = self.generic_tagTypes)
ret = p.normalize('=Jul 18 08:55:35= 83 =start listening on 127.0.0.1 pam auth started=')
self.assertEqual(ret['date'], 'Jul 18 08:55:35')
self.assertEqual(ret['id'], '83')
self.assertEqual(ret['msg'], 'start listening on 127.0.0.1 pam auth started')
def test_normalize_csv_pattern_006(self):
t1 = Tag(name='date',
tagtype = 'Anything',
substitute = 'DATE')
t2 = Tag(name='id',
tagtype = 'Anything',
substitute = 'ID')
t3 = Tag(name='msg',
tagtype = 'Anything',
substitute = 'MSG')
p_tags = {}
for t in (t1, t2, t3):
p_tags[t.name] = t
p = CSVPattern('test', 'DATE ID MSG', separator = ' ', quotechar = '=', tags = p_tags, tagTypes = self.tag_types, genericTagTypes = self.generic_tagTypes)
# The csv reader's default behaviour is to escape a quotechar by doubling it.
ret = p.normalize('=Jul 18 08:55:35= 83 =start listening on ==127.0.0.1 pam auth started=')
self.assertEqual(ret['date'], 'Jul 18 08:55:35')
self.assertEqual(ret['id'], '83')
self.assertEqual(ret['msg'], 'start listening on =127.0.0.1 pam auth started')
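The doublequote behaviour exercised by the test above comes straight from Python's csv module; a self-contained sketch with the same separator and quotechar:

```python
import csv

line = '=Jul 18 08:55:35= 83 =start listening on ==127.0.0.1 pam auth started='
# With quotechar '=', a doubled '==' inside a quoted field collapses to a
# single literal '=' (the csv module's default doublequote escaping).
row = next(csv.reader([line], delimiter=' ', quotechar='='))
print(row)  # → ['Jul 18 08:55:35', '83', 'start listening on =127.0.0.1 pam auth started']
```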
class TestCommonElementsPrecedence(unittest.TestCase):
"""Unit test used to validate that callbacks defined in a normalizer
take precedence over common callbacks."""
normalizer_path = os.environ['NORMALIZERS_PATH']
fake_syslog = StringIO("""
Uh
Ah
mhu@wallix.com
Oh
Eh
\d{1,3}
log["TEST"] = "TEST"
Hoo
Hi
MYMAC MYWHATEVER
the log's priority
urrrh
MYMAC
the log's date
bleeeh
MYWHATEVER
MMM dd hh:mm:ss
99 HERPA DERP
99
TEST
""")
n = parse(fake_syslog)
def test_00_validate_fake_syslog(self):
"""Validate the fake normalizer"""
dtd = DTD(open(os.path.join(self.normalizer_path,
'normalizer.dtd')))
self.assertTrue(dtd.validate(self.n))
def test_10_common_elements_precedence(self):
"""Testing callbacks priority"""
normalizer = Normalizer(self.n,
os.path.join(self.normalizer_path, 'common_tagTypes.xml'),
os.path.join(self.normalizer_path, 'common_callBacks.xml'))
self.assertTrue(normalizer.validate())
class TestFinalCallbacks(unittest.TestCase):
"""Unit test used to validate FinalCallbacks"""
normalizer_path = os.environ['NORMALIZERS_PATH']
fake_syslog = StringIO("""
Uh
Ah
mhu@wallix.com
Oh
Eh
[a-zA-Z]
log["toto"] = log["a"] + log["b"]
if not value:
log["tata"] = log["toto"] * 2
else:
log["tata"] = log["toto"] * 3
log['b'] = value * 2
Hoo
Hi
A B C
the log's priority
urrrh
A
the log's date
bleeeh
B
tutu
the log's priority
urrrh
C
a b c
a
bb
c
abb
abbabb
toto
tata
""")
n = parse(fake_syslog)
def test_00_validate_fake_syslog(self):
"""Validate the fake normalizer"""
dtd = DTD(open(os.path.join(self.normalizer_path,
'normalizer.dtd')))
self.assertTrue(dtd.validate(self.n))
def test_10_final_callbacks(self):
"""Testing final callbacks"""
normalizer = Normalizer(self.n,
os.path.join(self.normalizer_path, 'common_tagTypes.xml'),
os.path.join(self.normalizer_path, 'common_callBacks.xml'))
self.assertTrue(['toto', 'tata'] == normalizer.finalCallbacks)
self.assertTrue(normalizer.validate())
if __name__ == "__main__":
unittest.main()
pylogsparser-0.4/tests/test_suite.py
# -*- python -*-
# pylogsparser - Logs parsers python library
#
# Copyright (C) 2011 Wallix Inc.
#
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by the
# Free Software Foundation; either version 2.1 of the License, or (at your
# option) any later version.
#
# This library is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
# details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this library; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
""" The LogNormalizer need to be instanciated with the path to
normalizers XML definitions.
Tests expects to find normlizer path in NORMALIZERS_PATH environment variable.
$ NORMALIZERS_PATH=normalizers/ python tests/test_suite.py
"""
import unittest
import test_normalizer
import test_lognormalizer
import test_log_samples
import test_commonElements
tests = (test_commonElements,
test_normalizer,
test_lognormalizer,
test_log_samples,
)
load = unittest.defaultTestLoader.loadTestsFromModule
suite = unittest.TestSuite(map(load, tests))
unittest.TextTestRunner(verbosity=2).run(suite)
pylogsparser-0.4/tests/__init__.py
pylogsparser-0.4/tests/test_lognormalizer.py
# -*- python -*-
# pylogsparser - Logs parsers python library
#
# Copyright (C) 2011 Wallix Inc.
#
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by the
# Free Software Foundation; either version 2.1 of the License, or (at your
# option) any later version.
#
# This library is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
# details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this library; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
import os
import unittest
import tempfile
import shutil
from logsparser.lognormalizer import LogNormalizer
from lxml.etree import parse, fromstring as XMLfromstring
class Test(unittest.TestCase):
"""Unit tests for logsparser.lognormalizer"""
normalizer_path = os.environ['NORMALIZERS_PATH']
def test_000_invalid_paths(self):
"""Verify that we cannot instanciate LogNormalizer on invalid paths"""
def bleh(paths):
n = LogNormalizer(paths)
return n
self.assertRaises(ValueError, bleh, [self.normalizer_path, "/path/to/nowhere"])
self.assertRaises(ValueError, bleh, ["/path/to/nowhere",])
self.assertRaises(StandardError, bleh, ["/usr/bin/",])
def test_001_all_normalizers_activated(self):
""" Verify that all normalizers are
activated when we instantiate LogNormalizer
with an empty activation dict.
"""
ln = LogNormalizer(self.normalizer_path)
self.assertTrue(len(ln))
self.assertEqual(len([an[0] for an in ln.get_active_normalizers().items() if an[1]]), len(ln))
self.assertEqual(len(ln._cache), len(ln))
def test_002_deactivate_normalizer(self):
""" Verify that normalizer deactivation is working.
"""
ln = LogNormalizer(self.normalizer_path)
active_n = ln.get_active_normalizers()
to_deactivate = active_n.keys()[:2]
for to_d in to_deactivate:
del active_n[to_d]
ln.set_active_normalizers(active_n)
ln.reload()
self.assertEqual(len([an[0] for an in ln.get_active_normalizers().items() if an[1]]), len(ln)-2)
self.assertEqual(len(ln._cache), len(ln)-2)
def test_003_activate_normalizer(self):
""" Verify that normalizer activation is working.
"""
ln = LogNormalizer(self.normalizer_path)
active_n = ln.get_active_normalizers()
to_deactivate = active_n.keys()[0]
to_activate = to_deactivate
del active_n[to_deactivate]
ln.set_active_normalizers(active_n)
ln.reload()
# now deactivation should be done so reactivate
active_n[to_activate] = True
ln.set_active_normalizers(active_n)
ln.reload()
self.assertEqual(len([an[0] for an in ln.get_active_normalizers().items() if an[1]]), len(ln))
self.assertEqual(len(ln._cache), len(ln))
def test_004_normalizer_uuid(self):
""" Verify that we get at least uuid tag
"""
testlog = {'raw': 'a minimal log line'}
ln = LogNormalizer(self.normalizer_path)
ln.lognormalize(testlog)
self.assertTrue('uuid' in testlog.keys())
def test_005_normalizer_test_a_syslog_log(self):
""" Verify that lognormalizer extracts
syslog header as tags
"""
testlog = {'raw': 'Jul 18 08:55:35 naruto app[3245]: body message'}
ln = LogNormalizer(self.normalizer_path)
ln.lognormalize(testlog)
self.assertTrue('uuid' in testlog.keys())
self.assertTrue('date' in testlog.keys())
self.assertEqual(testlog['body'], 'body message')
self.assertEqual(testlog['program'], 'app')
self.assertEqual(testlog['pid'], '3245')
def test_006_normalizer_test_a_syslog_log_with_syslog_deactivate(self):
""" Verify that lognormalizer does not extract
syslog header as tags when syslog normalizer is deactivated.
"""
testlog = {'raw': 'Jul 18 08:55:35 naruto app[3245]: body message'}
ln = LogNormalizer(self.normalizer_path)
active_n = ln.get_active_normalizers()
to_deactivate = [n for n in active_n.keys() if n.find('syslog') >= 0]
for n in to_deactivate:
del active_n[n]
ln.set_active_normalizers(active_n)
ln.reload()
ln.lognormalize(testlog)
self.assertTrue('uuid' in testlog.keys())
self.assertFalse('date' in testlog.keys())
self.assertFalse('program' in testlog.keys())
def test_007_normalizer_getsource(self):
""" Verify we can retreive XML source
of a normalizer.
"""
ln = LogNormalizer(self.normalizer_path)
source = ln.get_normalizer_source('syslog-0.99')
self.assertEquals(XMLfromstring(source).getroottree().getroot().get('name'), 'syslog')
def test_008_normalizer_multiple_paths(self):
""" Verify we can can deal with multiple normalizer paths.
"""
fdir = tempfile.mkdtemp()
sdir = tempfile.mkdtemp()
for f in os.listdir(self.normalizer_path):
path_f = os.path.join(self.normalizer_path, f)
if os.path.isfile(path_f):
shutil.copyfile(path_f, os.path.join(fdir, f))
shutil.move(os.path.join(fdir, 'postfix.xml'),
os.path.join(sdir, 'postfix.xml'))
ln = LogNormalizer([fdir, sdir])
source = ln.get_normalizer_source('postfix-0.99')
self.assertEquals(XMLfromstring(source).getroottree().getroot().get('name'), 'postfix')
self.assertTrue(ln.get_normalizer_path('postfix-0.99').startswith(sdir))
self.assertTrue(ln.get_normalizer_path('syslog-0.99').startswith(fdir))
xml_src = ln.get_normalizer_source('syslog-0.99')
os.unlink(os.path.join(fdir, 'syslog.xml'))
ln.reload()
self.assertRaises(ValueError, ln.get_normalizer_path, 'syslog-0.99')
ln.update_normalizer(xml_src, dir_path = sdir)
self.assertTrue(ln.get_normalizer_path('syslog-0.99').startswith(sdir))
shutil.rmtree(fdir)
shutil.rmtree(sdir)
def test_009_normalizer_multiple_version(self):
""" Verify we can can deal with a normalizer with more than one version.
"""
fdir = tempfile.mkdtemp()
shutil.copyfile(os.path.join(self.normalizer_path, 'postfix.xml'),
os.path.join(fdir, 'postfix.xml'))
# Change normalizer version in fdir path
xml = parse(os.path.join(fdir, 'postfix.xml'))
xmln = xml.getroot()
xmln.set('version', '1.0')
xml.write(os.path.join(fdir, 'postfix.xml'))
ln = LogNormalizer([self.normalizer_path, fdir])
self.assertEquals(XMLfromstring(ln.get_normalizer_source('postfix-0.99')).getroottree().getroot().get('version'), '0.99')
self.assertEquals(XMLfromstring(ln.get_normalizer_source('postfix-1.0')).getroottree().getroot().get('version'), '1.0')
shutil.rmtree(fdir)
if __name__ == "__main__":
unittest.main()
pylogsparser-0.4/tests/test_log_samples.py
# -*- python -*-
# -*- coding: utf-8 -*-
# pylogsparser - Logs parsers python library
#
# Copyright (C) 2011 Wallix Inc.
#
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by the
# Free Software Foundation; either version 2.1 of the License, or (at your
# option) any later version.
#
# This library is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
# details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this library; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
"""Testing that normalization work as excepted
Here you can add samples logs to test existing or new normalizers.
In addition to examples validation defined in each normalizer xml definition
you should add validation tests here.
In this test all normalizer definitions are loaded and therefore
it is useful to detect normalization conflicts.
"""
import os
import unittest
from datetime import datetime
from logsparser import lognormalizer
normalizer_path = os.environ['NORMALIZERS_PATH']
ln = lognormalizer.LogNormalizer(normalizer_path)
class Test(unittest.TestCase):
def aS(self, log, subset, notexpected = ()):
"""Assert that the result of normalization of a given line log has the given subset."""
data = {'raw' : log,
'body' : log}
ln.lognormalize(data)
for key in subset:
self.assertEqual(data[key], subset[key])
for key in notexpected:
self.assertFalse(key in data.keys())
def test_simple_syslog(self):
"""Test syslog logs"""
now = datetime.now()
self.aS("<40>%s neo kernel: tun_wallix: Disabled Privacy Extensions" % now.strftime("%b %d %H:%M:%S"),
{'body': 'tun_wallix: Disabled Privacy Extensions',
'severity': 'emerg',
'severity_code' : '0',
'facility': 'syslog',
'facility_code' : '5',
'source': 'neo',
'program': 'kernel',
'date': now.replace(microsecond=0)})
self.aS("<40>%s fbo sSMTP[8847]: Cannot open mail:25" % now.strftime("%b %d %H:%M:%S"),
{'body': 'Cannot open mail:25',
'severity': 'emerg',
'severity_code' : '0',
'facility': 'syslog',
'facility_code' : '5',
'source': 'fbo',
'program': 'sSMTP',
'pid': '8847',
'date': now.replace(microsecond=0)})
self.aS("%s fbo sSMTP[8847]: Cannot open mail:25" % now.strftime("%b %d %H:%M:%S"),
{'body': 'Cannot open mail:25',
'source': 'fbo',
'program': 'sSMTP',
'pid': '8847',
'date': now.replace(microsecond=0)})
now = now.replace(month=now.month%12+1, day=1)
self.aS("<40>%s neo kernel: tun_wallix: Disabled Privacy Extensions" % now.strftime("%b %d %H:%M:%S"),
{'date': now.replace(microsecond=0, year=now.year-1),
'body': 'tun_wallix: Disabled Privacy Extensions',
'severity': 'emerg',
'severity_code' : '0',
'facility': 'syslog',
'facility_code' : '5',
'source': 'neo',
'program': 'kernel' })
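The last case above relies on the fact that syslog timestamps carry no year, so a date apparently in the future must belong to the previous year. A minimal sketch of that inference (an illustration, not the library's actual code):

```python
# Standalone sketch (not the normalizer's actual implementation):
# syslog timestamps have no year field, so pick the year that does not
# place the log in the future.
from datetime import datetime

def infer_syslog_date(stamp, now=None):
    """Parse 'Jul 18 08:55:35' and attach a plausible year."""
    now = now or datetime.now()
    d = datetime.strptime(stamp, "%b %d %H:%M:%S").replace(year=now.year)
    if d > now:
        # A "future" date means the log was emitted last year.
        d = d.replace(year=now.year - 1)
    return d

print(infer_syslog_date("Dec 31 23:59:59", now=datetime(2012, 1, 1)).year)  # 2011
```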
def test_postfix(self):
"""Test postfix logs"""
self.aS("<40>Dec 21 07:49:02 hosting03 postfix/cleanup[23416]: 2BD731B4017: message-id=<20071221073237.5244419B327@paris.office.wallix.com>",
{'program': 'postfix',
'component': 'cleanup',
'queue_id': '2BD731B4017',
'pid': '23416',
'message_id': '20071221073237.5244419B327@paris.office.wallix.com'})
# self.aS("<40>Dec 21 07:49:01 hosting03 postfix/anvil[32717]: statistics: max connection rate 2/60s for (smtp:64.14.54.229) at Dec 21 07:40:04",
# {'program': 'postfix',
# 'component': 'anvil',
# 'pid': '32717'})
#
self.aS("<40>Dec 21 07:49:01 hosting03 postfix/pipe[23417]: 1E83E1B4017: to=, relay=vmail, delay=0.13, delays=0.11/0/0/0.02, dsn=2.0.0, status=sent (delivered via vmail service)",
{'program': 'postfix',
'component': 'pipe',
'queue_id': '1E83E1B4017',
'message_recipient': 'gloubi@wallix.com',
'relay': 'vmail',
'dest_host': 'vmail',
'status': 'sent'})
self.aS("<40>Dec 21 07:49:04 hosting03 postfix/smtpd[23446]: C43971B4019: client=paris.office.wallix.com[82.238.42.70]",
{'program': 'postfix',
'component': 'smtpd',
'queue_id': 'C43971B4019',
'client': 'paris.office.wallix.com[82.238.42.70]',
'source_host': 'paris.office.wallix.com',
'source_ip': '82.238.42.70'})
# self.aS("<40>Dec 21 07:52:56 hosting03 postfix/smtpd[23485]: connect from mail.gloubi.com[65.45.12.22]",
# {'program': 'postfix',
# 'component': 'smtpd',
# 'ip': '65.45.12.22'})
self.aS("<40>Dec 21 08:42:17 hosting03 postfix/pipe[26065]: CEFFB1B4020: to=, orig_to=, relay=vacation, delay=4.1, delays=4/0/0/0.08, dsn=2.0.0, status=sent (delivered via vacation service)",
{'program': 'postfix',
'component': 'pipe',
'message_recipient': 'gloubi@wallix.com@autoreply.wallix.com',
'orig_to': 'gloubi@wallix.com',
'relay': 'vacation',
'dest_host': 'vacation',
'status': 'sent'})
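The `source_host`/`source_ip` pairs asserted above come from splitting the raw `client` value; the bundled postfix normalizer does this with a single regular expression over the `host[ip]:port` form:

```python
# Sketch using the host[ip]:port regex carried by the bundled postfix
# normalizer to split a client/relay value into its components.
import re

r = re.compile(r'(?P<host>[A-Za-z0-9\-\.]+)(?P<ip>\[.*\])?(?P<port>\:\d+)?$')

host, ip, port = r.match('paris.office.wallix.com[82.238.42.70]').groups()
print(host, ip.strip('[]'))  # paris.office.wallix.com 82.238.42.70
```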
def test_squid(self):
"""Test squid logs"""
self.aS("<40>Dec 21 07:49:02 hosting03 squid[54]: 1196341497.777 784 127.0.0.1 TCP_MISS/200 106251 GET http://fr.yahoo.com/ vbe DIRECT/217.146.186.51 text/html",
{ 'program': 'squid',
'date': datetime(2007, 11, 29, 13, 4, 57, 777000),
'elapsed': '784',
'source_ip': '127.0.0.1',
'event_id': 'TCP_MISS',
'status': '200',
'len': '106251',
'method': 'GET',
'url': 'http://fr.yahoo.com/',
'user': 'vbe' })
self.aS("<40>Dec 21 07:49:02 hosting03 : 1196341497.777 784 127.0.0.1 TCP_MISS/404 106251 GET http://fr.yahoo.com/gjkgf/gfgff/ - DIRECT/217.146.186.51 text/html",
{ 'program': 'squid',
'date': datetime(2007, 11, 29, 13, 4, 57, 777000),
'elapsed': '784',
'source_ip': '127.0.0.1',
'event_id': 'TCP_MISS',
'status': '404',
'len': '106251',
'method': 'GET',
'url': 'http://fr.yahoo.com/gjkgf/gfgff/' })
self.aS("Oct 22 01:27:16 pluto squid: 1259845087.188 10 82.238.42.70 TCP_MISS/200 13121 GET http://ak.bluestreak.com//adv/sig/%5E16238/%5E7451318/VABT.swf?url_download=&width=300&height=250&vidw=300&vidh=250&startbbanner=http://ak.bluestreak.com//adv/sig/%5E16238/%5E7451318/vdo_300x250_in.swf&endbanner=http://ak.bluestreak.com//adv/sig/%5E16238/%5E7451318/vdo_300x250_out.swf&video_hd=http://aak.bluestreak.com//adv/sig/%5E16238/%5E7451318/vdo_300x250_hd.flv&video_md=http://ak.bluestreak.com//adv/sig/%5E16238/%5E7451318/vdo_300x250_md.flv&video_bd=http://ak.bluestreak.comm//adv/sig/%5E16238/%5E7451318/vdo_300x250_bd.flv&url_tracer=http%3A//s0b.bluestreak.com/ix.e%3Fpx%26s%3D8008666%26a%3D7451318%26t%3D&start=2&duration1=3&duration2=4&duration3=5&durration4=6&duration5=7&end=8&hd=9&md=10&bd=11&gif=12&hover1=13&hover2=14&hover3=15&hover4=16&hover5=17&hover6=18&replay=19&sound_state=off&debug=0&playback_controls=off&tracking_objeect=tracking_object_8008666&url=javascript:bluestreak8008666_clic();&rnd=346.2680651591202 fbo DIRECT/92.123.65.129 application/x-shockwave-flash",
{'program' : "squid",
'date' : datetime.utcfromtimestamp(float(1259845087.188)),
'elapsed' : "10",
'source_ip' : "82.238.42.70",
'event_id' : "TCP_MISS",
'status' : "200",
'len' : "13121",
'method' : "GET",
'user' : "fbo",
'peer_status' : "DIRECT",
'peer_host' : "92.123.65.129",
'mime_type' : "application/x-shockwave-flash",
'url' : "http://ak.bluestreak.com//adv/sig/%5E16238/%5E7451318/VABT.swf?url_download=&width=300&height=250&vidw=300&vidh=250&startbbanner=http://ak.bluestreak.com//adv/sig/%5E16238/%5E7451318/vdo_300x250_in.swf&endbanner=http://ak.bluestreak.com//adv/sig/%5E16238/%5E7451318/vdo_300x250_out.swf&video_hd=http://aak.bluestreak.com//adv/sig/%5E16238/%5E7451318/vdo_300x250_hd.flv&video_md=http://ak.bluestreak.com//adv/sig/%5E16238/%5E7451318/vdo_300x250_md.flv&video_bd=http://ak.bluestreak.comm//adv/sig/%5E16238/%5E7451318/vdo_300x250_bd.flv&url_tracer=http%3A//s0b.bluestreak.com/ix.e%3Fpx%26s%3D8008666%26a%3D7451318%26t%3D&start=2&duration1=3&duration2=4&duration3=5&durration4=6&duration5=7&end=8&hd=9&md=10&bd=11&gif=12&hover1=13&hover2=14&hover3=15&hover4=16&hover5=17&hover6=18&replay=19&sound_state=off&debug=0&playback_controls=off&tracking_objeect=tracking_object_8008666&url=javascript:bluestreak8008666_clic();&rnd=346.2680651591202"})
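Squid stamps each access-log line with epoch seconds plus a millisecond fraction; the expected dates above are simply the UTC reading of that value (a standalone sketch):

```python
# Standalone sketch: squid's '1196341497.777' timestamp is epoch seconds
# with a millisecond fraction; its UTC conversion gives the expected date.
from datetime import datetime

d = datetime.utcfromtimestamp(1196341497.777)
print(d.year, d.month, d.day, d.hour, d.minute, d.second)
```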
def test_netfilter(self):
"""Test netfilter logs"""
self.aS("<40>Dec 26 09:30:07 dedibox kernel: FROM_INTERNET_DENY IN=eth0 OUT= MAC=00:40:63:e7:b2:17:00:15:fa:80:47:3f:08:00 SRC=88.252.4.37 DST=88.191.34.16 LEN=48 TOS=0x00 PREC=0x00 TTL=117 ID=56818 DF PROTO=TCP SPT=1184 DPT=445 WINDOW=65535 RES=0x00 SYN URGP=0",
{ 'program': 'netfilter',
'inbound_int': 'eth0',
'dest_mac': '00:40:63:e7:b2:17',
'source_mac': '00:15:fa:80:47:3f',
'source_ip': '88.252.4.37',
'dest_ip': '88.191.34.16',
'len': '48',
'protocol': 'TCP',
'source_port': '1184',
'prefix': 'FROM_INTERNET_DENY',
'dest_port': '445' })
self.aS("<40>Dec 26 08:45:23 dedibox kernel: TO_INTERNET_DENY IN=vif2.0 OUT=eth0 SRC=10.116.128.6 DST=82.225.197.239 LEN=121 TOS=0x00 PREC=0x00 TTL=63 ID=15592 DF PROTO=TCP SPT=993 DPT=56248 WINDOW=4006 RES=0x00 ACK PSH FIN URGP=0 ",
{ 'program': 'netfilter',
'inbound_int': 'vif2.0',
'outbound_int': 'eth0',
'source_ip': '10.116.128.6',
'dest_ip': '82.225.197.239',
'len': '121',
'protocol': 'TCP',
'source_port': '993',
'dest_port': '56248' })
# One malformed log
self.aS("<40>Dec 26 08:45:23 dedibox kernel: TO_INTERNET_DENY IN=vif2.0 OUT=eth0 DST=82.225.197.239 LEN=121 TOS=0x00 PREC=0x00 TTL=63 ID=15592 DF PROTO=TCP SPT=993 DPT=56248 WINDOW=4006 RES=0x00 ACK PSH FIN URGP=0 ",
{ 'program': 'kernel' },
('inbound_int', 'len'))
self.aS("Sep 28 15:19:59 tulipe-input kernel: [1655854.311830] DROPPED: IN=eth0 OUT= MAC=32:42:cd:02:72:30:00:23:7d:c6:35:e6:08:00 SRC=10.10.4.7 DST=10.10.4.86 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=20805 DF PROTO=TCP SPT=34259 DPT=111 WINDOW=5840 RES=0x00 SYN URGP=0",
{'program': 'netfilter',
'inbound_int' : "eth0",
'source_ip' : "10.10.4.7",
'dest_ip' : "10.10.4.86",
'len' : "60",
'protocol' : 'TCP',
'source_port' : '34259',
'dest_port' : '111',
'dest_mac' : '32:42:cd:02:72:30',
'source_mac' : '00:23:7d:c6:35:e6',
'prefix' : '[1655854.311830] DROPPED:' })
def test_dhcpd(self):
"""Test DHCPd log normalization"""
self.aS("<40>Dec 25 15:00:15 gnaganok dhcpd: DHCPDISCOVER from 02:1c:25:a3:32:76 via 183.213.184.122",
{ 'program': 'dhcpd',
'action': 'DISCOVER',
'source_mac': '02:1c:25:a3:32:76',
'via': '183.213.184.122' })
self.aS("<40>Dec 25 15:00:15 gnaganok dhcpd: DHCPDISCOVER from 02:1c:25:a3:32:76 via vlan18.5",
{ 'program': 'dhcpd',
'action': 'DISCOVER',
'source_mac': '02:1c:25:a3:32:76',
'via': 'vlan18.5' })
for log in [
"DHCPOFFER on 183.231.184.122 to 00:13:ec:1c:06:5b via 183.213.184.122",
"DHCPREQUEST for 183.231.184.122 from 00:13:ec:1c:06:5b via 183.213.184.122",
"DHCPACK on 183.231.184.122 to 00:13:ec:1c:06:5b via 183.213.184.122",
"DHCPNACK on 183.231.184.122 to 00:13:ec:1c:06:5b via 183.213.184.122",
"DHCPDECLINE of 183.231.184.122 from 00:13:ec:1c:06:5b via 183.213.184.122 (bla)",
"DHCPRELEASE of 183.231.184.122 from 00:13:ec:1c:06:5b via 183.213.184.122 for nonexistent lease" ]:
self.aS("<40>Dec 25 15:00:15 gnaganok dhcpd: %s" % log,
{ 'program': 'dhcpd',
'source_ip': '183.231.184.122',
'source_mac': '00:13:ec:1c:06:5b',
'via': '183.213.184.122' })
self.aS("<40>Dec 25 15:00:15 gnaganok dhcpd: DHCPINFORM from 183.231.184.122",
{ 'program': 'dhcpd',
'source_ip': '183.231.184.122',
'action': 'INFORM' })
def test_sshd(self):
"""Test SSHd normalization"""
self.aS("<40>Dec 26 10:32:40 naruto sshd[2274]: Failed password for bernat from 127.0.0.1 port 37234 ssh2",
{ 'program': 'sshd',
'action': 'fail',
'user': 'bernat',
'method': 'password',
'source_ip': '127.0.0.1' })
self.aS("<40>Dec 26 10:32:40 naruto sshd[2274]: Failed password for invalid user jfdghfg from 127.0.0.1 port 37234 ssh2",
{ 'program': 'sshd',
'action': 'fail',
'user': 'jfdghfg',
'method': 'password',
'source_ip': '127.0.0.1' })
self.aS("<40>Dec 26 10:32:40 naruto sshd[2274]: Failed none for invalid user kgjfk from 127.0.0.1 port 37233 ssh2",
{ 'program': 'sshd',
'action': 'fail',
'user': 'kgjfk',
'method': 'none',
'source_ip': '127.0.0.1' })
self.aS("<40>Dec 26 10:32:40 naruto sshd[2274]: Accepted password for bernat from 127.0.0.1 port 37234 ssh2",
{ 'program': 'sshd',
'action': 'accept',
'user': 'bernat',
'method': 'password',
'source_ip': '127.0.0.1' })
self.aS("<40>Dec 26 10:32:40 naruto sshd[2274]: Accepted publickey for bernat from 192.168.251.2 port 60429 ssh2",
{ 'program': 'sshd',
'action': 'accept',
'user': 'bernat',
'method': 'publickey',
'source_ip': '192.168.251.2' })
# See http://www.ossec.net/en/attacking-loganalysis.html
self.aS("<40>Dec 26 10:32:40 naruto sshd[2274]: Failed password for invalid user myfakeuser from 10.1.1.1 port 123 ssh2 from 192.168.50.65 port 34813 ssh2",
{ 'program': 'sshd',
'action': 'fail',
'user': 'myfakeuser from 10.1.1.1 port 123 ssh2',
'method': 'password',
'source_ip': '192.168.50.65' })
# self.aS("Aug 1 18:30:05 knight sshd[20439]: Illegal user guest from 218.49.183.17",
# {'program': 'sshd',
# 'source' : 'knight',
# 'user' : 'guest',
# 'source_ip': '218.49.183.17',
# 'body' : 'Illegal user guest from 218.49.183.17',
# })
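The injection sample above (from the OSSEC article) is defused by anchoring: with a greedy user group and an end-of-line anchor, the spoofed inner `from ... ssh2` is swallowed by the user field and only the real trailing address is captured. A hedged sketch of the idea (not the normalizer's actual pattern):

```python
# Sketch (an assumed pattern, not the shipped sshd normalizer): the greedy
# (?P<user>.+) plus the '$' anchor force 'from/port' to bind to the LAST
# occurrence, so injected text stays inside the user name.
import re

pat = re.compile(r'Failed (?P<method>\S+) for (?:invalid user )?(?P<user>.+) '
                 r'from (?P<ip>[\d.]+) port (?P<port>\d+) ssh2$')

m = pat.match('Failed password for invalid user myfakeuser from 10.1.1.1 '
              'port 123 ssh2 from 192.168.50.65 port 34813 ssh2')
print(m.group('user'))  # myfakeuser from 10.1.1.1 port 123 ssh2
print(m.group('ip'))    # 192.168.50.65
```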
def test_pam(self):
"""Test PAM normalization"""
self.aS("<40>Dec 26 10:32:25 s_all@naruto sshd[2263]: pam_unix(ssh:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost user=bernat",
{ 'program': 'ssh',
'component': 'pam_unix',
'type': 'auth',
'user': 'bernat' })
self.aS("<40>Dec 26 10:09:01 s_all@naruto CRON[2030]: pam_unix(cron:session): session opened for user root by (uid=0)",
{ 'program': 'cron',
'component': 'pam_unix',
'type': 'session',
'user': 'root' })
self.aS("<40>Dec 26 10:32:25 s_all@naruto sshd[2263]: pam_unix(ssh:auth): check pass; user unknown",
{ 'program': 'ssh',
'component': 'pam_unix',
'type': 'auth' })
# This one should be better handled
self.aS("<40>Dec 26 10:32:25 s_all@naruto sshd[2263]: pam_unix(ssh:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=localhost",
{ 'program': 'ssh',
'component': 'pam_unix',
'type': 'auth' })
def test_lea(self):
"""Test LEA normalization"""
self.aS("Oct 22 01:27:16 pluto lea: loc=7803|time=1199716450|action=accept|orig=fw1|i/f_dir=inbound|i/f_name=PCnet1|has_accounting=0|uuid=<47823861,00000253,7b040a0a,000007b6>|product=VPN-1 & FireWall-1|__policy_id_tag=product=VPN-1 & FireWall-1[db_tag={9F95C344-FE3F-4E3E-ACD8-60B5194BAAB4};mgmt=fw1;date=1199701916;policy_name=Standard]|src=naruto|s_port=36973|dst=fw1|service=941|proto=tcp|rule=1",
{'program' : 'lea',
'id' : "7803",
'action' : "accept",
'source_host' : "naruto",
'source_port' : "36973",
'dest_host' : "fw1",
'dest_port' : "941",
'protocol' : "tcp",
'product' : "VPN-1 & FireWall-1",
'inbound_int' : "PCnet1"})
def test_apache(self):
"""Test Apache normalization"""
# Test Common Log Format (CLF) "%h %l %u %t \"%r\" %>s %O"
self.aS("""127.0.0.1 - - [20/Jul/2009:00:29:39 +0300] "GET /index/helper/test HTTP/1.1" 200 889""",
{'program' : "apache",
'source_ip' : "127.0.0.1",
'request' : 'GET /index/helper/test HTTP/1.1',
'len' : "889",
'date' : datetime(2009, 7, 20, 0, 29, 39),
'body' : '127.0.0.1 - - [20/Jul/2009:00:29:39 +0300] "GET /index/helper/test HTTP/1.1" 200 889'})
# Test "combined" log format "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\""
self.aS('10.10.4.4 - - [04/Dec/2009:16:23:13 +0100] "GET /tulipe.core.persistent.persistent-module.html HTTP/1.1" 200 2937 "http://10.10.4.86/toc.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.3) Gecko/20090910 Ubuntu/9.04 (jaunty) Shiretoko/3.5.3"',
{'program' : "apache",
'source_ip' : "10.10.4.4",
'source_logname' : "-",
'user' : "-",
'date' : datetime(2009, 12, 4, 16, 23, 13),
'request' : 'GET /tulipe.core.persistent.persistent-module.html HTTP/1.1',
'status' : "200",
'len' : "2937",
'request_header_referer_contents' : "http://10.10.4.86/toc.html",
'useragent' : "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.3) Gecko/20090910 Ubuntu/9.04 (jaunty) Shiretoko/3.5.3",
'body' : '10.10.4.4 - - [04/Dec/2009:16:23:13 +0100] "GET /tulipe.core.persistent.persistent-module.html HTTP/1.1" 200 2937 "http://10.10.4.86/toc.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.3) Gecko/20090910 Ubuntu/9.04 (jaunty) Shiretoko/3.5.3"'})
# Test "vhost_combined" log format "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\""
#TODO: Update apache normalizer to handle this format.
def test_bind9(self):
"""Test Bind9 normalization"""
self.aS("Oct 22 01:27:16 pluto named: client 192.168.198.130#4532: bad zone transfer request: 'www.abc.com/IN': non-authoritative zone (NOTAUTH)",
{'event_id' : "zone_transfer_bad",
'zone' : "www.abc.com",
'source_ip' : '192.168.198.130',
'class' : 'IN',
'program' : 'named'})
self.aS("Oct 22 01:27:16 pluto named: general: notice: client 10.10.4.4#39583: query: tpf.qa.ifr.lan IN SOA +",
{'event_id' : "client_query",
'domain' : "tpf.qa.ifr.lan",
'category' : "general",
'severity' : "notice",
'class' : "IN",
'source_ip' : "10.10.4.4",
'program' : 'named'})
self.aS("Oct 22 01:27:16 pluto named: createfetch: 126.92.194.77.zen.spamhaus.org A",
{'event_id' : "fetch_request",
'domain' : "126.92.194.77.zen.spamhaus.org",
'program' : 'named'})
def test_symantec8(self):
"""Test Symantec version 8 normalization"""
self.aS("""200A13080122,23,2,8,TRAVEL00,SYSTEM,,,,,,,16777216,"Symantec AntiVirus Realtime Protection Loaded.",0,,0,,,,,0,,,,,,,,,,SAMPLE_COMPUTER,,,,Parent,GROUP,,8.0.93330""",
{"program" : "symantec",
"date" : datetime(2002, 11, 19, 8, 1, 34),
"category" : "Summary",
"local_host" : "TRAVEL00",
"domain_name" : "GROUP",
"event_logger_type" : "System",
"event_id" : "GL_EVENT_RTS_LOAD",
"eventblock_action" : "EB_LOG",
"group_id" : "0",
"operation_flags" : "0",
"parent" : "SAMPLE_COMPUTER",
"scan_id" : "0",
"server_group" : "Parent",
"user" : "SYSTEM",
"version" : "8.0.93330"})
# Need to find real symantec version 9 log lines
def test_symantec9(self):
"""Test Symantec version 9 normalization"""
self.aS("""200A13080122,23,2,8,TRAVEL00,SYSTEM,,,,,,,16777216,"Symantec AntiVirus Realtime Protection Loaded.",0,,0,,,,,0,,,,,,,,,,SAMPLE_COMPUTER,,,,Parent,GROUP,,9.0.93330,,,,,,,,,,,,,,,,,,,,""",
{"program" : "symantec",
"date" : datetime(2002, 11, 19, 8, 1, 34),
"category" : "Summary",
"local_host" : "TRAVEL00",
"domain_name" : "GROUP",
"event_logger_type" : "System",
"event_id" : "GL_EVENT_RTS_LOAD",
"eventblock_action" : "EB_LOG",
"group_id" : "0",
"operation_flags" : "0",
"parent" : "SAMPLE_COMPUTER",
"scan_id" : "0",
"server_group" : "Parent",
"user" : "SYSTEM",
"version" : "9.0.93330"})
def test_arkoonFAST360(self):
"""Test Arkoon FAST360 normalization"""
self.aS('AKLOG-id=firewall time="2004-02-25 17:38:57" fw=myArkoon aktype=IP gmtime=1077727137 ip_log_type=ENDCONN src=10.10.192.61 dst=10.10.192.255 proto="137/udp" protocol=17 port_src=137 port_dest=137 intf_in=eth0 intf_out= pkt_len=78 nat=NO snat_addr=0 snat_port=0 dnat_addr=0 dnat_port=0 user="userName" pri=3 rule="myRule" action=DENY reason="Blocked by filter" description="dst addr received from Internet is private"',
{"program" : "arkoon",
"date" : datetime(2004, 02, 25, 16, 38, 57),
"event_id" : "IP",
"priority" : "3",
"local_host" : "myArkoon",
"user" : "userName",
"protocol": "udp",
"dest_ip" : "10.10.192.255",
"source_ip" : "10.10.192.61",
"reason" : "Blocked by filter",
"ip_log_type" : "ENDCONN",
"body" : 'id=firewall time="2004-02-25 17:38:57" fw=myArkoon aktype=IP gmtime=1077727137 ip_log_type=ENDCONN src=10.10.192.61 dst=10.10.192.255 proto="137/udp" protocol=17 port_src=137 port_dest=137 intf_in=eth0 intf_out= pkt_len=78 nat=NO snat_addr=0 snat_port=0 dnat_addr=0 dnat_port=0 user="userName" pri=3 rule="myRule" action=DENY reason="Blocked by filter" description="dst addr received from Internet is private"'})
# Assuming this kind of log, with a syslog-like header, is typically sent over the wire.
self.aS('<134>IP-Logs: AKLOG - id=firewall time="2010-10-04 10:38:37" gmtime=1286181517 fw=doberman.jurassic.ta aktype=IP ip_log_type=NEWCONN src=172.10.10.107 dst=204.13.8.181 proto="http" protocol=6 port_src=2619 port_dest=80 intf_in=eth7 intf_out=eth2 pkt_len=48 nat=HIDE snat_addr=10.10.10.199 snat_port=16176 dnat_addr=0 dnat_port=0 tcp_seq=1113958286 tcp_ack=0 tcp_flags="SYN" user="" vpn-src="" pri=6 rule="surf_normal" action=ACCEPT',
{'program': 'arkoon',
'event_id': 'IP',
'rule': 'surf_normal',
'ip_log_type': 'NEWCONN'})
# This one must not match the arkoonFAST360 parser
# Assuming this kind of log does not exist
self.aS('<40>Dec 21 08:42:17 hosting arkoon: <134>IP-Logs: AKLOG - id=firewall time="2010-10-04 10:38:37" gmtime=1286181517 fw=doberman.jurassic.ta aktype=IP ip_log_type=NEWCONN src=172.10.10.107 dst=204.13.8.181 proto="http" protocol=6 port_src=2619 port_dest=80 intf_in=eth7 intf_out=eth2 pkt_len=48 nat=HIDE snat_addr=10.10.10.199 snat_port=16176 dnat_addr=0 dnat_port=0 tcp_seq=1113958286 tcp_ack=0 tcp_flags="SYN" user="" vpn-src="" pri=6 rule="surf_normal" action=ACCEPT',
{'program': 'arkoon'}, # program is set by syslog parser
('event_id', 'rule', 'ip_log_type'))
def test_MSExchange2007MTL(self):
"""Test Exchange 2007 message tracking log normalization"""
self.aS("""2010-04-19T12:29:07.390Z,10.10.14.73,WIN2K3DC,,WIN2K3DC,"MDB:ada3d2c3-6f32-45db-b1ee-a68dbcc86664, Mailbox:68cf09c1-1344-4639-b013-3c6f8a588504, Event:1440, MessageClass:IPM.Note, CreationTime:2010-04-19T12:28:51.312Z, ClientType:User",,STOREDRIVER,SUBMIT,,,,,,,,,Coucou !,user7@qa.ifr.lan,,""",
{'mdb': 'ada3d2c3-6f32-45db-b1ee-a68dbcc86664',
'source_host': 'WIN2K3DC',
'source_ip': '10.10.14.73',
'client_type': 'User',
'creation_time': 'Mon Apr 19 12:28:51 2010',
'date': datetime(2010, 4, 19, 12, 29, 7, 390000),
'event': '1440',
'event_id': 'SUBMIT',
'exchange_source': 'STOREDRIVER',
'mailbox': '68cf09c1-1344-4639-b013-3c6f8a588504',
'message_class': 'IPM.Note',
'message_id': 'C6539E897AEDFA469FE34D029FB708D43495@win2k3dc.qa.ifr.lan',
'message_subject': 'Coucou !',
'program': 'MS Exchange 2007 Message Tracking',
'dest_host': 'WIN2K3DC'})
def test_S3(self):
"""Test Amazon S3 bucket log normalization"""
self.aS("""DEADBEEF testbucket [19/Jul/2011:13:17:11 +0000] 10.194.22.16 FACEDEAD CAFEDECA REST.GET.ACL - "GET /?acl HTTP/1.1" 200 - 951 - 397 - "-" "Jakarta Commons-HttpClient/3.0" -""",
{'source_ip': '10.194.22.16',
'http_method': 'GET',
'protocol': 'HTTP/1.1',
'status': '200',
'user': 'DEADBEEF',
'method': 'REST.GET.ACL',
'program': 's3'})
def test_Snare(self):
"""Test Snare for Windows log normalization"""
self.aS(unicode("""<13> Aug 31 15:46:47 a-zA-Z0-9_ MSWinEventLog 1 System 287 ven. août 26 16:45:45 201 4 Virtual Disk Service Constantin N/A Information a-zA-Z0-9_ None Le service s’est arrêté. 119 """, 'utf8'),
{'snare_event_log_type': 'MSWinEventLog',
'criticality': '1',
'event_log_source_name': 'System',
'snare_event_counter': '287',
'event_id': '4',
'event_log_expanded_source_name': 'Virtual Disk Service',
'user': 'Constantin',
'sid_used': 'N/A',
'event_type': 'Information',
'source_host': 'a-zA-Z0-9_',
'audit_event_category': 'None',
'program' : 'EventLog',
'body': unicode('Le service s’est arrêté. 119 ', 'utf8')})
self.aS(unicode("""<13> Aug 31 15:46:47 a-zA-Z0-9_ MSWinEventLog 0 Security 284 ven. août 26 16:42:01 201 4689 Microsoft-Windows-Security-Auditing A-ZA-Z0-9_\\clo N/A Success Audit a-zA-Z0-9_ Fin du processus Un processus est terminé. Sujet : ID de sécurité : S-1-5-21-2423214773-420032381-3839276281-1000 Nom du compte : clo Domaine du compte : A-ZA-Z0-9_ ID d’ouverture de session : 0x21211 Informations sur le processus : ID du processus : 0xb4c Nom du processus : C:\\Windows\\System32\\taskeng.exe État de fin : 0x0 138 """, 'utf8'),
{'snare_event_log_type': 'MSWinEventLog',
'criticality': '0',
'event_log_source_name': 'Security',
'snare_event_counter': '284',
'event_id': '4689',
'event_log_expanded_source_name': 'Microsoft-Windows-Security-Auditing',
'user': 'A-ZA-Z0-9_\\clo',
'sid_used': 'N/A',
'event_type': 'Success Audit',
'source_host': 'a-zA-Z0-9_',
'audit_event_category': 'Fin du processus',
'program' : "EventLog",
'body': unicode('Un processus est terminé. Sujet : ID de sécurité : S-1-5-21-2423214773-420032381-3839276281-1000 Nom du compte : clo Domaine du compte : A-ZA-Z0-9_ ID d’ouverture de session : 0x21211 Informations sur le processus : ID du processus : 0xb4c Nom du processus : C:\\Windows\\System32\\taskeng.exe État de fin : 0x0 138 ', 'utf8')})
def test_vmwareESX4_ESXi4(self):
"""Test VMware ESX 4.x and VMware ESXi 4.x log normalization"""
self.aS("""[2011-09-05 16:06:30.016 F4CD1B90 verbose 'Locale' opID=996867CC-000002A6] Default resource used for 'host.SystemIdentificationInfo.IdentifierType.ServiceTag.summary' expected in module 'enum'.""",
{'date': datetime(2011, 9, 5, 16, 6, 30),
'numeric': 'F4CD1B90',
'level': 'verbose',
'alpha': 'Locale',
'body': 'Default resource used for \'host.SystemIdentificationInfo.IdentifierType.ServiceTag.summary\' expected in module \'enum\'.'})
self.aS("""sysboot: Executing 'kill -TERM 314'""",
{'body': 'Executing \'kill -TERM 314\''})
# def test_mysql(self):
# """Test mysql log normalization"""
# self.aS("""110923 11:04:58 36 Query show databases""",
# {'date': datetime(2011, 9, 23, 11, 4, 58),
# 'id': '36',
# 'type': 'Query',
# 'event': 'show databases'})
# self.aS("""110923 10:09:11 [Note] Plugin 'FEDERATED' is disabled.""",
# {'date': datetime(2011, 9, 23, 10, 9, 11),
# 'component': 'Note',
# 'event': 'Plugin \'FEDERATED\' is disabled.'})
def test_IIS(self):
"""Test IIS log normalization"""
self.aS("""172.16.255.255, anonymous, 03/20/01, 23:58:11, MSFTPSVC, SALES1, 172.16.255.255, 60, 275, 0, 0, 0, PASS, /Intro.htm, -,""",
{'source_ip': '172.16.255.255',
'user': 'anonymous',
'date': datetime(2001, 3, 20, 23, 58, 11),
'service': 'MSFTPSVC',
'dest_host': 'SALES1',
'dest_ip': '172.16.255.255',
'time_taken': 0.06,
'sent_bytes_number': '275',
'returned_bytes_number': '0',
'status': '0',
'windows_status_code': '0',
'method': 'PASS',
'url_path': '/Intro.htm',
'script_parameters': '-'})
self.aS("""2011-09-26 13:57:48 W3SVC1 127.0.0.1 GET /tapage.asp - 80 - 127.0.0.1 Mozilla/4.0+(compatible;MSIE+6.0;+windows+NT5.2;+SV1;+.NET+CLR+1.1.4322) 404 0 2""",
{'date': datetime(2011, 9, 26, 13, 57, 48),
'service': 'W3SVC1',
'dest_ip': '127.0.0.1',
'method': 'GET',
'url_path': '/tapage.asp',
'query': '-',
'port': '80',
'user': '-',
'source_ip': '127.0.0.1',
'useragent': 'Mozilla/4.0+(compatible;MSIE+6.0;+windows+NT5.2;+SV1;+.NET+CLR+1.1.4322)',
'status': '404',
'substatus': '0',
'win_status': '2'})
def test_fail2ban(self):
"""Test fail2ban ssh banishment logs"""
self.aS("""2011-09-25 05:09:02,371 fail2ban.filter : INFO Log rotation detected for /var/log/auth.log""",
{'program' : 'fail2ban',
'component' : 'filter',
'body' : "Log rotation detected for /var/log/auth.log",
'date' : datetime(2011,9,25,5,9,2).replace(microsecond = 371000)})
self.aS("""2011-09-25 21:59:24,304 fail2ban.actions: WARNING [ssh] Ban 219.117.199.6""",
{'program' : 'fail2ban',
'component' : 'actions',
'action' : "Ban",
'protocol' : "ssh",
'source_ip' : "219.117.199.6",
'date' : datetime(2011,9,25,21,59,24).replace(microsecond = 304000)})
def test_bitdefender(self):
"""Test bitdefender spam.log (Mail Server for UNIX version)"""
self.aS('10/20/2011 07:24:26 BDMAILD SPAM: sender: marcelo@nitex.com.br, recipients: re@corp.com, sender IP: 127.0.0.1, subject: "Lago para pesca, piscina, charrete, Hotel Fazenda", score: 1000, stamp: " v1, build 2.10.1.12405, blacklisted, total: 1000(750)", agent: Smtp Proxy 3.1.3, action: drop (move-to-quarantine;drop), header recipients: ( "cafe almoço e janta incluso" ), headers: ( "Received: from localhost [127.0.0.1] by BitDefender SMTP Proxy on localhost [127.0.0.1] for localhost [127.0.0.1]; Thu, 20 Oct 2011 07:24:26 +0200 (CEST)" "Received: from paris.office.corp.com (go.corp.lan [10.10.1.254]) by as-bd-64.ifr.lan (Postfix) with ESMTP id 4D23D1C7 for ; Thu, 20 Oct 2011 07:24:26 +0200 (CEST)" "Received: from rj50ssp.nitex.com.br (rj154ssp.nitex.com.br [177.47.99.154]) by paris.office.corp.com (Postfix) with ESMTP id 28C0D6A4891 for ; Thu, 20 Oct 2011 07:17:59 +0200 (CEST)" "Received: from rj154ssp.nitex.com.br (ced-sp.tuavitoria.com.br [177.47.99.13]) by rj50ssp.nitex.com.br (Postfix) with ESMTP id 9B867132C9E; Wed, 19 Oct 2011 22:29:20 -0200 (BRST)" ), group: "Default"',
{'message_sender' : 'marcelo@nitex.com.br',
'program' : 'bitdefender',
'action' : 'drop',
'message_recipients' : 're@corp.com',
'date' : datetime(2011,10,20,7,24,26),
'reason' : 'blacklisted'})
self.aS('10/24/2011 04:31:39 BDSCAND ERROR: failed to initialize the AV core',
{'program' : 'bitdefender',
'body' : 'failed to initialize the AV core',
'date' : datetime(2011,10,24,4,31,39)})
def test_simple_wabauth(self):
"""Test syslog logs"""
self.aS("Dec 20 17:20:22 wab2 WAB(CORE)[18190]: type='session closed' username='admin' secondary='root@debian32' client_ip='10.10.4.25' src_protocol='SFTP_SESSION' dst_protocol='SFTP_SESSION' message=''",
{ 'account': 'root',
'client_ip': '10.10.4.25',
'date': datetime(2011, 12, 20, 17, 20, 22),
'dest_proto': 'SFTP_SESSION',
'message': '',
'pid': '18190',
'program': 'WAB(CORE)',
'resource': 'debian32',
'source': 'wab2',
'source_proto': 'SFTP_SESSION',
'type': 'session closed',
'username': 'admin'})
self.aS("Dec 20 17:19:35 wab2 WAB(CORE)[18190]: type='primary_authentication' timestamp='2011-12-20 17:19:35.621952' username='admin' client_ip='10.10.4.25' diagnostic='SUCCESS'",
{'client_ip': '10.10.4.25',
'date': datetime(2011, 12, 20, 17, 19, 35),
'diagnostic': 'SUCCESS',
'pid': '18190',
'program': 'WAB(CORE)',
'source': 'wab2',
'type': 'primary_authentication',
'username': 'admin'})
self.aS("Dec 20 17:19:35 wab2 WAB(CORE)[18190]: type='session opened' username='admin' secondary='root@debian32' client_ip='10.10.4.25' src_protocol='SFTP_SESSION' dst_protocol='SFTP_SESSION' message=''",
{ 'account': 'root',
'client_ip': '10.10.4.25',
'date': datetime(2011, 12, 20, 17, 19, 35),
'dest_proto': 'SFTP_SESSION',
'message': '',
'pid': '18190',
'program': 'WAB(CORE)',
'resource': 'debian32',
'source': 'wab2',
'source_proto': 'SFTP_SESSION',
'type': 'session opened',
'username': 'admin'})
def test_xferlog(self):
"""Testing xferlog formatted logs"""
self.aS("Thu Sep 2 09:52:00 2004 50 192.168.20.10 896242 /home/test/file1.tgz b _ o r suporte ftp 0 * c ",
{'transfer_time' : '50',
'source_ip' : '192.168.20.10',
'len' : '896242',
'filename' : '/home/test/file1.tgz',
'transfer_type_code' : 'b',
'special_action_flag' : '_',
'direction_code' : 'o',
'access_mode_code' : 'r',
'completion_status_code' : 'c',
'authentication_method_code' : '0',
'transfer_type' : 'binary',
'special_action' : 'none',
'direction' : 'outgoing',
'access_mode' : 'real',
'completion_status' : 'complete',
'authentication_method' : 'none',
'user' : 'suporte',
'service_name' : 'ftp',
'authenticated_user_id' : '*',
'program' : 'ftpd',
'date' : datetime(2004,9,2,9,52),})
self.aS("Tue Dec 27 11:24:23 2011 1 127.0.0.1 711074 /home/mhu/Documents/Brooks,_Max_-_World_War_Z.mobi b _ o r mhu ftp 0 * c",
{'transfer_time' : '1',
'source_ip' : '127.0.0.1',
'len' : '711074',
'filename' : '/home/mhu/Documents/Brooks,_Max_-_World_War_Z.mobi',
'transfer_type_code' : 'b',
'special_action_flag' : '_',
'direction_code' : 'o',
'access_mode_code' : 'r',
'completion_status_code' : 'c',
'authentication_method_code' : '0',
'transfer_type' : 'binary',
'special_action' : 'none',
'direction' : 'outgoing',
'access_mode' : 'real',
'completion_status' : 'complete',
'authentication_method' : 'none',
'user' : 'mhu',
'service_name' : 'ftp',
'authenticated_user_id' : '*',
'program' : 'ftpd',
'date' : datetime(2011,12,27,11,24,23),})
def test_dansguardian(self):
"""Testing dansguardian logs"""
self.aS("2011.12.13 10:41:28 10.10.42.23 10.10.42.23 http://safebrowsing.clients.google.com/safebrowsing/downloads?client=Iceweasel&appver=3.5.16&pver=2.2&wrkey=AKEgNityGqylPYNyNETvnRjDjo4mIKcwv7f-8UCJaKERjXG6cXrikbgdA0AG6J8A6zng73h9U1GoE7P5ZPn0dDLmD_t3q1csCw== *EXCEPTION* Site interdit trouv&ecute;. POST 491 0 2 200 - limited_access -",
{'program' : 'dansguardian',
'user' : '10.10.42.23',
'source_ip' : '10.10.42.23',
'url' : 'http://safebrowsing.clients.google.com/safebrowsing/downloads?client=Iceweasel&appver=3.5.16&pver=2.2&wrkey=AKEgNityGqylPYNyNETvnRjDjo4mIKcwv7f-8UCJaKERjXG6cXrikbgdA0AG6J8A6zng73h9U1GoE7P5ZPn0dDLmD_t3q1csCw==',
'actions' : "*EXCEPTION*",
'action' : 'EXCEPTION',
'reason' : "Site interdit trouv&ecute;.",
"method" : "POST",
"len" : "491",
"naughtiness" : "0",
"filter_group_number" : "2",
"status" : "200",
"mime_type" : "-",
"filter_group_name" : "limited_access",
'date' : datetime(2011,12,13,10,41,28),})
def test_deny_event(self):
"""Testing denyAll event logs"""
self.aS("""224,2011-01-24 17:44:46.061903,2011-01-24 17:44:46.061903,,,192.168.219.10,127.0.0.1,,2,1,4,0,"Session opened (read-write), Forwarded for 192.168.219.1.",superadmin,gui,,{403ec510-27d9-11e0-bbe7-000c298895c5}Session,,,,,,,,,,,,,,,,,,,,""",
{'alert_id': '0',
'alert_subtype': 'Access',
'alert_subtype_id': '1',
'alert_type': 'System',
'alert_type_id': '2',
'alert_value': 'Session opened (read-write), Forwarded for 192.168.219.1.',
'body': '224,2011-01-24 17:44:46.061903,2011-01-24 17:44:46.061903,,,192.168.219.10,127.0.0.1,,2,1,4,0,"Session opened (read-write), Forwarded for 192.168.219.1.",superadmin,gui,,{403ec510-27d9-11e0-bbe7-000c298895c5}Session,,,,,,,,,,,,,,,,,,,,',
'date': datetime(2011, 1, 24, 17, 44, 46),
'end_date': '2011-01-24 17:44:46.061903',
'event': 'User successful login',
'event_uid': '224',
'interface': 'gui',
'ip_device': '192.168.219.10',
'parameter_changed': '{403ec510-27d9-11e0-bbe7-000c298895c5}Session',
'raw': '224,2011-01-24 17:44:46.061903,2011-01-24 17:44:46.061903,,,192.168.219.10,127.0.0.1,,2,1,4,0,"Session opened (read-write), Forwarded for 192.168.219.1.",superadmin,gui,,{403ec510-27d9-11e0-bbe7-000c298895c5}Session,,,,,,,,,,,,,,,,,,,,',
'severity': 'Warn',
'severity_code': '4',
'source_ip': '127.0.0.1',
'user': 'superadmin'})
self.aS("""1,2011-01-20 15:09:38.130965,2011-01-20 15:09:38.130965,,,::1,,,2,2,5,0,rWeb started.,,,,,,,,,,,,,,,,,,,,,,,,""",
{'alert_id': '0',
'alert_subtype': 'Device Operations',
'alert_subtype_id': '2',
'alert_type': 'System',
'alert_type_id': '2',
'alert_value': 'rWeb started.',
'body': '1,2011-01-20 15:09:38.130965,2011-01-20 15:09:38.130965,,,::1,,,2,2,5,0,rWeb started.,,,,,,,,,,,,,,,,,,,,,,,,',
'date': datetime(2011, 1, 20, 15, 9, 38),
'end_date': '2011-01-20 15:09:38.130965',
'event': 'rWeb started',
'event_uid': '1',
'ip_device': '::1',
'raw': '1,2011-01-20 15:09:38.130965,2011-01-20 15:09:38.130965,,,::1,,,2,2,5,0,rWeb started.,,,,,,,,,,,,,,,,,,,,,,,,',
'severity': 'Notice',
'severity_code': '5'} )
def test_cisco_asa(self):
"""Testing CISCO ASA logs"""
self.aS("""<168>Mar 05 2010 11:06:12 ciscoasa : %ASA-6-305011: Built dynamic TCP translation from 14net:14.36.103.220/300 to 172net:172.18.254.146/55""",
{'program': 'cisco-asa',
'severity_code': '6',
'event_id': '305011',
'date': datetime(2010, 3, 5, 11, 6, 12),
'taxonomy': 'firewall',
'outbound_int': '172net',
'dest_port': '55'})
self.aS("""<168>Jul 02 2006 07:33:45 ciscoasa : %ASA-6-302013: Built outbound TCP connection 8300517 for outside:64.156.4.191/110 (64.156.4.191/110) to inside:192.168.8.12/3109 (xxx.xxx.185.142/11310)""",
{'program': 'cisco-asa',
'severity_code': '6',
'event_id': '302013',
'date': datetime(2006, 7, 2, 7, 33, 45),
'taxonomy': 'firewall',
'outbound_int': 'inside',
'dest_ip': '192.168.8.12'})
if __name__ == "__main__":
unittest.main()
pylogsparser-0.4/tests/test_commonElements.py
# -*- python -*-
# pylogsparser - Logs parsers python library
#
# Copyright (C) 2011 Wallix Inc.
#
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by the
# Free Software Foundation; either version 2.1 of the License, or (at your
# option) any later version.
#
# This library is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
# details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this library; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
import os
import unittest
from datetime import datetime, timedelta
from logsparser.normalizer import get_generic_tagTypes
from logsparser.normalizer import get_generic_callBacks
def get_sensible_year(*args):
"""args is a list of ordered date elements, from month and day (both
mandatory) to eventual second. The function gives the most sensible
year for that set of values, so that the date is not set in the future."""
year = int(datetime.now().year)
d = datetime(year, *args)
if d > datetime.now():
return year - 1
return year
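The year-inference helper above can be exercised on its own; a minimal standalone sketch (duplicating the function so the snippet is self-contained):

```python
from datetime import datetime

def get_sensible_year(*args):
    """Most recent year for which datetime(year, *args) is not in
    the future; args start with month and day."""
    year = datetime.now().year
    if datetime(year, *args) > datetime.now():
        return year - 1
    return year

# Today's month/day resolves to the current year: midnight of the
# current day can never lie in the future.
now = datetime.now()
assert get_sensible_year(now.month, now.day) == now.year
```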
def generic_time_callback_test(instance, cb):
"""Testing time formatting callbacks. This is boilerplate code."""
# so far only time related callbacks were written. If it changes, list
# here non related functions to skip in this test.
instance.assertTrue(cb in instance.cb.keys())
DATES_TO_TEST = [ datetime.utcnow() + timedelta(-1),
datetime.utcnow() + timedelta(-180),
datetime.utcnow() + timedelta(1), # will always be considered as in the future unless you're testing on new year's eve...
]
# The pattern translation list. Order is important !
translations = [ ("YYYY", "%Y"),
("YY" , "%y"),
("DDD" , "%a"), # localized day
("DD" , "%d"), # day with eventual leading 0
("dd" , "%d"),
("MMM" , "%b"), # localized month
("MM" , "%m"), # month number with eventual leading 0
("hh" , "%H"),
("mm" , "%M"),
("ss" , "%S") ]
pattern = cb
for old, new in translations:
pattern = pattern.replace(old, new)
# special cases
if pattern == "ISO8601":
pattern = "%Y-%m-%dT%H:%M:%SZ"
for d in DATES_TO_TEST:
if pattern == "EPOCH":
value = d.strftime('%s') + ".%i" % (d.microsecond/1000)
expected_result = datetime.utcfromtimestamp(float(value))
else:
value = d.strftime(pattern)
expected_result = datetime.strptime(value, pattern)
# Deal with time formats that don't define a year explicitly
if "%y" not in pattern.lower():
expected_year = get_sensible_year(*expected_result.timetuple()[1:-3])
expected_result = expected_result.replace(year = expected_year)
log = {}
instance.cb[cb](value, log)
instance.assertTrue("date" in log.keys())
instance.assertEqual(log['date'], expected_result)
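The ordered replacement of format mnemonics into strptime directives, as used above, can be sketched in isolation; the ordering is what keeps "YYYY" from being half-consumed by the "YY" rule (standalone sketch mirroring the translation table, names illustrative):

```python
# Longer mnemonics must be replaced first: "YYYY" before "YY",
# "MMM" before "MM", so a partial match never corrupts the pattern.
TRANSLATIONS = [("YYYY", "%Y"), ("YY", "%y"),
                ("DDD", "%a"), ("DD", "%d"), ("dd", "%d"),
                ("MMM", "%b"), ("MM", "%m"),
                ("hh", "%H"), ("mm", "%M"), ("ss", "%S")]

def to_strptime(pattern):
    """Translate a normalizer date-format string into strptime syntax."""
    for old, new in TRANSLATIONS:
        pattern = pattern.replace(old, new)
    return pattern

assert to_strptime("dd/MMM/YYYY:hh:mm:ss") == "%d/%b/%Y:%H:%M:%S"
assert to_strptime("MMM dd hh:mm:ss") == "%b %d %H:%M:%S"
```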
class TestGenericLibrary(unittest.TestCase):
"""Unit testing for the generic libraries"""
normalizer_path = os.environ['NORMALIZERS_PATH']
tagTypes = get_generic_tagTypes(os.path.join(normalizer_path,
'common_tagTypes.xml'))
cb = get_generic_callBacks(os.path.join(normalizer_path,
'common_callBacks.xml'))
def test_000_availability(self):
"""Testing libraries' availability"""
self.assertTrue( self.tagTypes != {} )
self.assertTrue( self.cb != {} )
def test_010_test_tagTypes(self):
"""Testing tagTypes' accuracy"""
self.assertTrue(self.tagTypes['EpochTime'].compiled_regexp.match('12934824.134'))
self.assertTrue(self.tagTypes['EpochTime'].compiled_regexp.match('12934824'))
self.assertTrue(self.tagTypes['syslogDate'].compiled_regexp.match('Jan 23 10:23:45'))
self.assertTrue(self.tagTypes['syslogDate'].compiled_regexp.match('Oct 6 23:05:10'))
self.assertTrue(self.tagTypes['URL'].compiled_regexp.match('http://www.wallix.org'))
self.assertTrue(self.tagTypes['URL'].compiled_regexp.match('https://mysecuresite.com/?myparam=myvalue&myotherparam=myothervalue'))
self.assertTrue(self.tagTypes['Email'].compiled_regexp.match('mhu@wallix.com'))
self.assertTrue(self.tagTypes['Email'].compiled_regexp.match('matthieu.huin@wallix.com'))
self.assertTrue(self.tagTypes['Email'].compiled_regexp.match('John-Fitzgerald.Willis@super-duper.institution.withlotsof.subdomains.org'))
self.assertTrue(self.tagTypes['IP'].compiled_regexp.match('192.168.1.1'))
self.assertTrue(self.tagTypes['IP'].compiled_regexp.match('255.255.255.0'))
# shouldn't match ...
self.assertFalse(self.tagTypes['IP'].compiled_regexp.match('999.888.777.666'))
self.assertTrue(self.tagTypes['MACAddress'].compiled_regexp.match('0e:88:6a:4b:00:ff'))
self.assertTrue(self.tagTypes['ZuluTime'].compiled_regexp.match('2012-12-21'))
self.assertTrue(self.tagTypes['ZuluTime'].compiled_regexp.match('2012-12-21T12:34:56.99'))
# I wish there was a way to create these tests on the fly ...
def test_020_test_time_callback(self):
"""Testing callback MM/dd/YYYY hh:mm:ss"""
generic_time_callback_test(self, "MM/dd/YYYY hh:mm:ss")
def test_030_test_time_callback(self):
"""Testing callback dd/MMM/YYYY:hh:mm:ss"""
generic_time_callback_test(self, "dd/MMM/YYYY:hh:mm:ss")
def test_040_test_time_callback(self):
"""Testing callback MMM dd hh:mm:ss"""
generic_time_callback_test(self, "MMM dd hh:mm:ss")
def test_050_test_time_callback(self):
"""Testing callback DDD MMM dd hh:mm:ss YYYY"""
generic_time_callback_test(self, "DDD MMM dd hh:mm:ss YYYY")
def test_060_test_time_callback(self):
"""Testing callback YYYY-MM-DD hh:mm:ss"""
generic_time_callback_test(self, "YYYY-MM-DD hh:mm:ss")
def test_070_test_time_callback(self):
"""Testing callback MM/DD/YY, hh:mm:ss"""
generic_time_callback_test(self, "MM/DD/YY, hh:mm:ss")
def test_075_test_time_callback(self):
"""Testing callback YYMMDD hh:mm:ss"""
generic_time_callback_test(self, "YYMMDD hh:mm:ss")
def test_080_test_time_callback(self):
"""Testing callback ISO8601"""
generic_time_callback_test(self, "ISO8601")
def test_090_test_time_callback(self):
"""Testing callback EPOCH"""
generic_time_callback_test(self, "EPOCH")
def test_100_test_time_callback(self):
"""Testing callback dd-MMM-YYYY hh:mm:ss"""
generic_time_callback_test(self, "dd-MMM-YYYY hh:mm:ss")
if __name__ == "__main__":
unittest.main()
pylogsparser-0.4/setup.py
# -*- python -*-
# pylogsparser - Logs parsers python library
#
# Copyright (C) 2011 Wallix SARL
#
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by the
# Free Software Foundation; either version 2.1 of the License, or (at your
# option) any later version.
#
# This library is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
# details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this library; if not, write to the Free Software Foundation, Inc.,
# 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
import os
import glob
from distutils.core import setup
# Utility function to read the README file.
# Used for the long_description. It's nice, because now 1) we have a top level
# README file and 2) it's easier to type in the README file than to put a raw
# string in below ...
def read(fname):
return open(os.path.join(os.path.dirname(__file__), fname)).read()
data = glob.glob('normalizers/*.xml')
data.extend(glob.glob('normalizers/*.template'))
data.extend(glob.glob('normalizers/*.dtd'))
fr_trans = glob.glob('logsparser/i18n/fr_FR/LC_MESSAGES/normalizer.*')
setup(
name = "pylogsparser",
version = "0.4",
author = "Wallix",
author_email = "opensource@wallix.org",
description = ("A log parser library packaged with a set of ready to use parsers (DHCPd, Squid, Apache, ...)"),
license = "LGPL",
keywords = "log parser xml library python",
url = "http://www.wallix.org/pylogsparser-project/",
package_dir={'logsparser.tests':'tests'},
packages=['logsparser', 'logsparser.tests', 'logsparser.extras'],
data_files=[('share/logsparser/normalizers', data),
('share/logsparser/i18n/fr_FR/LC_MESSAGES/', fr_trans),],
requires=['lxml', 'pytz'],
long_description=read('README.rst'),
# http://pypi.python.org/pypi?:action=list_classifiers
classifiers=[
"Development Status :: 4 - Beta",
"Topic :: System :: Logging",
"Topic :: Software Development :: Libraries",
"License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)",
],
)
pylogsparser-0.4/logsparser/__init__.py
pylogsparser-0.4/logsparser/i18n/fr_FR/LC_MESSAGES/normalizer.po
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR , YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2012-01-18 12:36+0100\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME \n"
"Language-Team: LANGUAGE \n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"
#: logsparser/normalizer.py:738
#, python-format
msgid ""
"%(title)s\n"
"\n"
"**Written by**\n"
"\n"
"%(authors)s\n"
"\n"
"Description\n"
":::::::::::\n"
"\n"
"%(description)s %(taxonomy)s\n"
"\n"
"This normalizer can parse logs of the following structure(s):\n"
"\n"
"%(patterns)s\n"
"\n"
"Examples\n"
"::::::::\n"
"\n"
"%(examples)s"
msgstr ""
"%(title)s\n"
"\n"
"**Auteur(s)**\n"
"\n"
"%(authors)s\n"
"\n"
"Description\n"
":::::::::::\n"
"\n"
"%(description)s\n %(taxonomy)s\n"
"\n"
"Ce normaliseur reconnaît les logs structurés de la façon suivante:\n"
"\n"
"%(patterns)s\n"
"\n"
"Exemples\n"
"::::::::\n"
"\n"
"%(examples)s"
#: logsparser/normalizer.py:762 logsparser/normalizer.py:773
msgid "undocumented"
msgstr "non documenté"
#: logsparser/normalizer.py:766
#, python-format
msgid "This normalizer belongs to the category : *%s*"
msgstr "Ce normaliseur appartient à la catégorie : *%s*"
#: logsparser/normalizer.py:771
msgid ""
", where\n"
"\n"
msgstr ", où\n"
"\n"
#: logsparser/normalizer.py:773
#, python-format
msgid " * **%s** is %s "
msgstr " * **%s** est %s "
#: logsparser/normalizer.py:775
#, python-format
msgid "(normalized as *%s*)"
msgstr "(tag associé : *%s*)"
#: logsparser/normalizer.py:778
msgid ""
"\n"
" Additionally, The following tags are automatically set:\n"
"\n"
msgstr "\n"
" Les tags additionnels suivants sont définis automatiquement :\n"
"\n"
#: logsparser/normalizer.py:788
#, python-format
msgid ""
"* *%s*, normalized as\n"
"\n"
msgstr ""
"* *%s*, dont les tags suivants sont extraits:\n"
"\n"
pylogsparser-0.4/logsparser/i18n/fr_FR/LC_MESSAGES/normalizer.mo