==> stenographer-1.0.1/.travis.yml <==

language: go
sudo: required
dist: trusty
before_install:
  - sudo apt-get update -qq
  - sudo apt-get install -y libaio-dev
  - sudo apt-get install -y libleveldb-dev
  - sudo apt-get install -y libsnappy-dev
  - sudo apt-get install -y g++
  - sudo apt-get install -y libcap2-bin
  - sudo apt-get install -y libseccomp-dev
  - sudo apt-get install -y jq
script:
  - go test ./...
  - bash integration_test/test.sh

==> stenographer-1.0.1/CONTRIBUTING.md <==

Contributing To Stenographer
============================

Want to contribute? Great! First, read this page (including the small print at the end).

### Before you contribute ###

Before we can use your code, you must sign the [Google Individual Contributor License Agreement](https://developers.google.com/open-source/cla/individual?csw=1) (CLA), which you can do online. The CLA is necessary mainly because you own the copyright to your changes, even after your contribution becomes part of our codebase, so we need your permission to use and distribute your code. We also need to be sure of various other things, for instance that you'll tell us if you know that your code infringes on other people's patents. You don't have to sign the CLA until after you've submitted your code for review and a member has approved it, but you must do it before we can put your code into our codebase.

Before you start working on a larger contribution, you should get in touch with us first through the issue tracker with your idea so that we can help out and possibly guide you.
Coordinating up front makes it much easier to avoid frustration later on.

### Code reviews ###

All submissions, including submissions by project members, require review. We use GitHub pull requests for this purpose.

### The small print ###

Contributions made by corporations are covered by a different agreement than the one above, the Software Grant and Corporate Contributor License Agreement.

==> stenographer-1.0.1/DESIGN.md <==

Stenographer/Stenotype Design
=============================

Introduction
------------

This document is meant to give an overview of the design of stenographer and stenotype at a medium/high level. For the low-level details, look at the code :). The architecture described in this document has changed relatively little over the course of the project, and we doubt it will change much in the future.

High-Level Design
-----------------

Stenographer consists of a `stenographer` server, which serves user requests and manages disk, and which runs a `stenotype` child process. `stenotype` sniffs packet data and writes it to disk, communicating with `stenographer` simply by un-hiding files when they're ready for consumption. The user scripts `stenocurl` and `stenoread` provide simple wrappers around `curl`, which allow analysts to request packet data from the `stenographer` server simply and easily.

Detailed Design
---------------

Stenographer is actually a few separate processes.

### Stenographer ###

Stenographer is a long-running server, the binary that you start up if you want to "run stenographer" on your system. It manages the `stenotype` binary as a child process, watches disk usage and cleans up old files, and serves data to analysts based on their queries.

#### Running Stenotype ####

First off, stenographer is in charge of making sure that `stenotype` (discussed momentarily) starts and keeps running.
It starts stenotype as a subprocess, watching for failures and restarting it as necessary. It also watches stenotype's output (the files it creates) and may kill/restart stenotype itself if it appears to be misbehaving or not generating files fast enough.

#### Managing Disk(s) ####

Stenographer watches the disks that stenotype uses and tries to keep them tidy and usable. This includes deleting old files when free disk space decreases below a threshold, and deleting old temporary files that stenotype creates, if stenotype crashes before it can clean up after itself.

Stenographer handles disk management in two ways. First, it runs checks whenever it starts up a new stenotype instance, to make sure files from an old, possibly crashed instance are no longer around and causing issues. Second, it periodically checks disk state for out-of-disk issues (currently every 15 seconds). During that periodic check, it also looks for new files stenotype may have generated that it can use to serve analyst requests (described momentarily).

#### Serving Data ####

Stenographer is also in charge of serving analyst requests for packet data. It watches the data generated by stenotype, and when analysts request packets, it looks up their requests in the generated data and returns them.

Stenographer provides data to analysts over TLS. Queries are POST'd to the /query HTTP handler, and responses are streamed back as PCAP files (MIME type application/octet-stream). Currently, stenographer only binds to localhost, so it doesn't accept remote user requests.

#### Access Control ####

Access to the server is controlled with client certificates. On install, a script, `stenokeys.sh`, is run to generate a CA certificate and use it to create/sign a client and a server certificate. The client and server authenticate each other on every request, using the CA certificate as a source of truth. POSIX permissions are used locally to control access to the certs...
the `stenographer` user, which runs steno, has read access to the server key (`steno:root -r--------`), and the `stenographer` group has read access to the client key (`root:steno ----r-----`). Key usage extensions specify that the server key must be used as a TLS server, and the client key must be used as a TLS client. Due to the file permissions mentioned above, giving steno access to a local user simply requires adding that user to the local `stenographer` group, thus giving them access to `client_key.pem`.

Once keys are created on install, they're currently NEVER REVOKED. Thus, if someone gets access to a client cert, they'll have access to the server ad infinitum. Should you have problems with a key being released, the current best way to handle this is to delete all data in the `/etc/stenographer/certs` directory and rerun `stenokeys.sh` to generate an entirely new set of keys rooted in a new CA.

`stenokeys.sh` will not modify keys/certs that already exist in `/etc/stenographer/certs`. Thus, if you have more complex topologies, you can overwrite these values and they'll happily be used by Stenographer. If, for example, you already have a CA in your organization, you can copy its cert into the `ca_cert.pem` file, then create `{client,server}_{key,cert}.pem` files rooted in that CA and copy them in. This also allows folks to use a single CA cert across multiple stenographer instances, allowing a single client cert to access multiple servers over the network.

### Stenotype ###

Stenotype's sole purpose is to read packet data off the wire, index it, and write it to disk. It uses a multi-threaded architecture, while trying to limit context switching by having most processing on a single core stay within a single thread.

#### Packet Sniffing/Writing ####

Stenotype tries to be as performant as possible by allowing the kernel to do the vast majority of the work.
It uses AF_PACKET, which asks the kernel to place packets into blocks in a shared memory region, then notify stenotype when blocks are available. After indexing the packets in each block, stenotype passes the block directly back to the kernel as an O_DIRECT asynchronous write operation. Besides indexing, then, stenotype's main job is to wait for the kernel to put packets in a memory region, then immediately ask the kernel to take that region back and write it.

An important benefit of this design is that packets are never copied out of the kernel's shared memory space. The kernel writes them from the NIC to shared memory, then the kernel uses that same shared memory for O_DIRECT writes to disk. The packets transit the bus twice and are never copied from RAM to RAM.

#### Packet File Format ####

As detailed above, the "file format" used by stenotype is actually a direct dump of the data as it's presented by AF_PACKET. Thus, data is written as blocks, with each block containing a small header followed by a linked list of packets. Blocks are large (1MB), and are dumped regularly (every 10s), so there's a good chance that for slow networks we use far more disk than we need. However, as network speed increases past 1MB/minute/thread, this format becomes quite efficient. There will always be overhead, however.

Stenotype guarantees that a packet file will not exceed 4GB, rotating files if they reach that size. It also rotates files older than 1 minute. Files are named for the microsecond timestamp at which they were created. While a file is being written, it is hidden (`.1422693160230282`). When rotating, the file is renamed to no longer be hidden (`.1422693160230282` -> `1422693160230282`). This rename only occurs after all data has been successfully flushed to disk, so external processes which see this rename happen (like stenographer) can immediately start to use the newly renamed file.
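The naming and hide-then-rename handoff described above can be sketched in a few lines. This is an illustrative Python rendering of the convention, not Stenographer's actual code; the helper names and directory handling are hypothetical:

```python
import os
from datetime import datetime, timezone

def filename_to_time(name):
    """Packet files are named for their creation time in microseconds since
    the epoch; hidden (still-being-written) files carry a leading dot."""
    return datetime.fromtimestamp(int(name.lstrip('.')) / 1e6, tz=timezone.utc)

def finalize(directory, hidden_name):
    """Un-hide a fully flushed file: '.1422693160230282' -> '1422693160230282'.

    rename() within a filesystem is atomic, so a watcher (like stenographer)
    sees either the hidden in-progress file or the complete visible one.
    """
    visible_name = hidden_name.lstrip('.')
    os.rename(os.path.join(directory, hidden_name),
              os.path.join(directory, visible_name))
    return visible_name
```

For example, `filename_to_time('.1422693160230282')` falls in January 2015, matching the sample filename used throughout this document.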
#### Packet Load Balancing ####

Stenotype takes advantage of AF_PACKET's excellent load-balancing options to split up the work of processing packets across many CPUs. It uses AF_PACKET's PACKET_FANOUT to create a separate memory region for each of N different threads, then requests that the kernel split up incoming packets across these regions. One stenotype packet reading/writing thread is created for each of these regions. Within that single thread, block processing (reading in a block, indexing it, starting an async write, reading the next block, etc.) happens serially.

#### Indexing ####

After getting a block of packets from the kernel, but before passing them back to be written out, stenotype reads through each packet and creates a small number of indexes in memory. These indexes are very simple, mapping a packet attribute to a file seek offset. Attributes we index include ports (src and dst), protocols (udp/tcp/etc.), and IPs (v4 and v6).

Indexes are dumped to disk when file rotation happens, with a corresponding index file created for each packet file, of the same name but in a different directory. Given the example above, when the `.1422693160230282` -> `1422693160230282` file rotation happens, an index also named `.1422693160230282` will be created and written, then renamed to `1422693160230282` when the index has been fully flushed to disk. Once both the packets directory and the index directory have a `1422693160230282` file, stenographer can read both in and use the index to look up packets.

#### Index File Format ####

Indexes are leveldb SSTables: a simple, compressed file format that stores key-value pairs sorted by key and provides simple, efficient mechanisms to query individual keys or key ranges. Among other things, leveldb tables give us great compression capabilities, keeping our indexes small while still providing fast reads.
We store each attribute (port number, protocol number, IP, etc.) and its associated packet positions in the blockfile using the format:

    Key:   [type (1 byte)][value (? bytes)]
    Value: [position 0 (4 bytes)][position 1 (4 bytes)] ...

The type specifies the kind of attribute being indexed (1 == protocol, 2 == port, 4 == IPv4, 6 == IPv6). The value is 1 byte for protocols, 2 bytes for ports, and 4 and 16 bytes respectively for IPv4 and IPv6 addresses. Each position is a seek offset into a packet file (which is guaranteed not to exceed 4GB) and is always exactly 4 bytes long. All values (ports, protocols, positions) are big-endian.

Looking up packets involves reading the key for a specific attribute to get all positions for that value, then seeking into the packet files to find the packets in question and returning them. For example, to find all packets with port 80, you'd read in the positions for the key:

    [\x02 (type=port) \x00\x50 (value=80)]

#### Index Writing ####

The main stenotype packet sniffing thread tries to very quickly read in packet blocks, index them, then pass them back to the kernel. It does all disk operations asynchronously, in order to keep its CPU busy with indexing, by far the most time-intensive part of the whole operation. It would be extremely detrimental to performance to have this thread block on each file rotation to convert in-memory indexes to on-disk indexes and write out index files. Because of this, index writing is relegated to a separate thread. For each reading/writing thread, an index-writing thread is created, along with a thread-safe producer-consumer queue to link them up. When the reader/writer wants to rotate a file, it simply passes a pointer to its in-memory index over the queue, then creates a new empty index and starts populating it with packet data for its new file.

The index-writing thread sits in an endless loop, watching the queue for new indexes.
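Stepping back to the key/value layout described above, the byte-level encoding can be sketched concretely. This is an illustrative Python rendering of the layout given in this section; Stenographer's real reader and writer live in its Go and C++ sources:

```python
import ipaddress
import struct

# Type bytes from the index format: 1=protocol, 2=port, 4=IPv4, 6=IPv6.
def protocol_key(proto):
    return struct.pack('>BB', 1, proto)          # 1-byte value

def port_key(port):
    return struct.pack('>BH', 2, port)           # 2-byte big-endian value

def ip_key(ip):
    addr = ipaddress.ip_address(ip)
    kind = 4 if addr.version == 4 else 6
    return struct.pack('>B', kind) + addr.packed # 4- or 16-byte value

def positions_value(offsets):
    # Each packet-file seek offset is exactly 4 bytes, big-endian
    # (packet files never exceed 4GB, so 4 bytes always suffice).
    return b''.join(struct.pack('>I', off) for off in offsets)
```

With this encoding, `port_key(80)` yields `b'\x02\x00\x50'`, the "all packets with port 80" key from the example above.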
When the index-writing thread gets a new index, it creates a leveldb table, iterates through the index to populate that table, and flushes that table to disk. Since index writing takes (in our experience) far less time/energy than packet writing, the index-writing thread does all of its operations serially, blocking while the index is flushed to disk, then moving that index into its usable (non-hidden) location.

### Stenoread/Stenocurl ###

As detailed above in Stenographer's "Access Control" section, we require TLS handshakes in order to verify that clients are indeed allowed access to packet data. To aid in this, the simple shell script `stenocurl` wraps the `curl` utility, adding the various flags necessary to use the correct client certificate and verify against the correct server certificate. `stenoread` is a simple addition to stenocurl, which takes in a query string, passes the query to stenocurl as a POST request, then passes the resulting PCAP file through tcpdump in order to allow for additional filtering, writing to disk, printing in a human-readable format, etc.

#### How Queries Work ####

An analyst who wants to query stenographer calls the `stenoread` script, passing in a query string (see README.md for the query language format). This string is then POST'd (via stenocurl, using TLS certs/keys) to stenographer.
Stenographer parses the query into a Query object, which allows it to decide:

* which index files it should read
* which keys it should read from each index file
* how it should combine the packet file positions it gets from each key

To illustrate, for the query string

    (port 1 or ip proto 2) and after 3h ago

Stenographer would translate:

* `after 3h ago` -> only read index files with microsecond names greater than (now() - 3h)
* within these files, compute the union (because of the `or`) of the position sets from
  * key `\x02\x00\x01` (port == 1)
  * key `\x01\x02` (protocol == 2)

Once it has computed a set of packet positions for each index file, it then seeks into the corresponding packet files, reads the packets out, and merges them into a single PCAP file, which it serves back to the analyst. This PCAP file comes back via stenocurl as a stream to STDOUT, where stenoread passes it through tcpdump. With no additional options, tcpdump just prints the packet data out in a nice format. With various options, tcpdump can do further filtering (by TCP flags, etc.), write its input to disk (`-w out.pcap`), or do all the other things tcpdump is so good at.

### gRPC ###

Stenographer has gRPC support that enables secure, remote interactions with the program. Given the sensitive nature of packet data and the requirement of many users to manage a fleet of servers running Stenographer, the gRPC channel only supports encryption with client authentication, and it expects the administrator to use certificates that are managed separately from those generated by `stenokeys.sh` (for easily generating certificates, take a look at Square's [certstrap](https://github.com/square/certstrap) utility). The protobuf that defines Stenographer's gRPC service can be found in protobuf/steno.proto.

gRPC support is optional and can be enabled by adding an `Rpc` dictionary of settings to `steno.conf`.
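The query-translation walkthrough above can be sketched as a toy evaluator. The helper names here are hypothetical, and each index file is modeled as a plain dict from key bytes to position lists; the real query engine is Stenographer's Go code:

```python
def eligible_files(file_names, cutoff_micros):
    """'after 3h ago' keeps only index files whose microsecond-timestamp
    names are greater than the cutoff."""
    return [name for name in file_names if int(name) > cutoff_micros]

def union_positions(index, keys):
    """'or' combines the position sets pulled from each key with a set
    union ('and' would intersect them instead)."""
    positions = set()
    for key in keys:
        positions |= set(index.get(key, ()))
    return sorted(positions)

# (port 1 or ip proto 2) evaluated against a single index file:
index = {
    b'\x02\x00\x01': [16, 4096],   # port == 1
    b'\x01\x02': [4096, 9000],     # ip proto == 2
}
```

Here `union_positions(index, [b'\x02\x00\x01', b'\x01\x02'])` yields the merged, sorted offsets at which the matching packets can be read out of the packet file.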
An example `Rpc` configuration is shown below:

```json
, "Rpc": { "CaCert": "/path/to/rpc/ca/cert"
         , "ServerKey": "/path/to/rpc/key"
         , "ServerCert": "/path/to/rpc/cert"
         , "ServerPort": 8443
         , "ServerPcapPath": "/path/to/rpc/pcap/directory"
         , "ServerPcapMaxSize": 1000000000
         , "ClientPcapChunkSize": 1000
         , "ClientPcapMaxSize": 5000000
         }
```

#### RetrievePcap ####

This call allows clients to remotely retrieve PCAP via `stenoread`. To retrieve PCAP, clients send the service a unique identifier, the size of the PCAP file chunks to stream in return, the maximum size of the PCAP file to return, and the `stenoread` query used to parse packet data. In response, clients receive a stream of messages containing the unique identifier and PCAP file chunks (which need to be reassembled client-side).

Below is a minimalist example (shown in Python) of how a client can request PCAP and save it to local disk (it assumes a gRPC `server` address and `creds` have already been set up, and that the generated `steno_pb2`/`steno_pb2_grpc` modules are available):

```py
import os
import uuid

import grpc

import steno_pb2
import steno_pb2_grpc

with grpc.secure_channel(server, creds) as channel:
    stub = steno_pb2_grpc.StenographerStub(channel)
    pb = steno_pb2.PcapRequest()
    pb.uid = str(uuid.uuid4())
    pb.chunk_size = 1000
    pb.max_size = 500000
    pb.query = 'after 5m ago and tcp'
    pcap_file = os.path.join('.', '{}.pcap'.format(pb.uid))
    with open(pcap_file, 'wb') as fout:
        for response in stub.RetrievePcap(pb):
            fout.write(response.pcap)
```

`RetrievePcap` requires the gRPC server to be configured with the following fields (in addition to any fields required for the server to start up):

- ServerPcapPath: local path to the directory where `stenoread` PCAP is temporarily stored
- ServerPcapMaxSize: upper limit on how much PCAP a client is allowed to receive (used to restrict clients from receiving excessively large PCAPs)
- ClientPcapChunkSize: size of the PCAP chunks to stream to the client (used if the client has not specified a size in the request)
- ClientPcapMaxSize: upper limit on how much PCAP a client will receive (used if the client has not specified a size in the request)

### Defense In Depth ###

#### Stenotype ####

We're pretty
scared of stenotype, because:

1. We're processing untrusted data: packets.
2. We've got very strong permissions: the ability to read packets.
3. It's written in a memory-unsafe language: C++.
4. We're not perfect.

Because of this, we've tried to use security best practices to minimize the risk of running these binaries, with the following methods:

* Running as an unprivileged user `stenographer`.
* We `setcap` the stenotype binary to have just the ability to read raw packets.
* If you DON'T want to use `setcap`, we also offer the ability to drop privileges with `setuid/setgid` after starting `stenotype`... you can start it as `root`, then drop privs to an untrusted user (that user must still be able to open/write files in the index/packet directories).
* `seccomp` sandboxing: `stenotype` sandboxes itself after opening up sockets for packet reading. This sandbox isn't particularly granular, but it should stop us from doing anything too crazy if the `stenotype` binary is compromised.
* Fuzzing: We've extracted the most concerning bit of code (the indexing code that processes packet data) and fuzzed it as best we can, using the excellent [AFL](http://lcamtuf.coredump.cx/afl/) fuzzer. If you'd like to run your own fuzzing, install AFL, then run `make fuzz` in the `stenotype/` subdirectory, and watch your CPUs become forced-air heaters.
* We're considering AppArmor, and may add some configs to use it for locking down stenotype as well.

#### Stenographer ####

We're slightly less concerned about `stenographer`, since it doesn't actually process packet information. It also has a smaller attack surface, especially when bound to localhost. Our major attack vector in `stenographer` is queries coming in over TLS. However, TLS certificate handling is all done with the Go standard library (which we trust pretty well ;), so our code only ever touches queries that come from a user in the `stenographer` group.
Since we run it as user `stenographer`, if someone in the `stenographer` group does achieve a shell, they'll be able to... read packets. The big concern here is that they'll be able to read more packets than allowed by default (let's say we've passed a BPF filter to stenotype, for example). Our primary defenses, then, are:

* Running as an unprivileged user `stenographer`
* Using Go's standard library TLS to reject requests not coming from relatively trusted users
* Using Go, which is much more memory-safe (runtime array bounds checks, etc.)
* We're considering AppArmor here, too, and will update this doc if we come up with good configs.

Design Limitations
------------------

Some of Stenographer's design decisions make it perform poorly in certain environments or give it strange performance characteristics. This section aims to point these out in advance, so folks have a better understanding of some of the idiosyncrasies they may see when deploying Stenographer.

### Slow Links, Large Files ###

Stenographer is optimized for fast links, and some of those optimizations give it strange behavior on slow links. The first of these is file size. You may notice that on a network link that's REALLY slow, you'll still see 6MB files created every minute. This is because currently, Stenographer will:

* Store packets in 1MB _blocks_
* Flush one _block_ every 10 seconds

Of course, if your link generates over 1MB every 10 seconds, this doesn't matter to you at all. If it does, though, you're going to waste disk space. We're considering flushing one block a minute or every thirty seconds.

### Packets Don't Show Up Immediately ###

With `stenotype` writing files and `stenographer` reading them, a packet won't show up in a request's response until it's on disk, its index is on disk, and `stenographer` has noticed both of these things occurring.
This means that packets are generally 1-2 minutes behind real-time, since:

* Packets are stored by the kernel for up to 10 seconds before being written to disk
* Packet files flush every minute
* Index files are created/flushed starting when packet files are written
* `stenographer` looks for new files on disk every 15 seconds

Altogether, this means that there's a maximum 100-120 second delay between `stenotype` seeing a packet and `stenographer` being able to serve that packet in response to analyst requests. Note that for fast links, this time is reduced slightly, since:

* Stenotype flushes a block whenever it gets 1MB of packets, reducing the initial 10-second wait for the kernel.
* `stenotype` flushes at 1 minute OR at 4GB, whichever comes first, so if you get over 4GB/min, you'll flush files/indexes faster than once a minute.

==> stenographer-1.0.1/INSTALL.md <==

Installing Stenographer
=======================

If you'd prefer to read a shell script, you can take a look at install.sh :) Also, we do plan to eventually create a real debian package config, and once that's done we'll provide deb packages for easier installation.

This documentation provides our current method for installing stenographer on a machine, including the justifications for why we think that method is currently best practice (or why it's not ;)

User/Group Setup
----------------

We set up a single system group `stenographer` and a system user `stenographer`.

### Group `stenographer` ###

The `stenographer` group is used to control access to locally stored packet data. Users are added to this group to allow them to query stenographer (via the `stenoread` command).

### User `stenographer` ###

The `stenographer` user is used to run the `stenographer` and `stenotype` binaries. We use the `stenographer` user to protect the system from `stenographer` and `stenotype`...
this user has no special privileges except the ability to read/write packet data and run the (setcap'd) stenotype binary. So if either is compromised, the system as a whole won't be. See the **Defense In Depth** section of DESIGN.md for more details.

Configuration Files
-------------------

There are a number of files in the `configs/` subdirectory which may help in installation:

* `steno.conf`: Discussed in more detail in the **Configuration File** section. A configuration file must exist at `/etc/stenographer/config`
* `upstart.conf`: Upstart configuration file; can be copied into `/etc/init/stenographer.conf` to allow upstart to manage Stenographer
* `limits.conf`: If you don't use upstart, you may need to move this to `/etc/security/limits.d/stenographer.conf` in order to allow Stenographer to create files at the size it needs, and to open the number of files it needs to

Needed Directories
------------------

There are a few directories Stenographer needs in order to run correctly:

* `/etc/stenographer root:root/0755`: Stores the configuration file
* `/etc/stenographer/certs stenographer:stenographer/0750`: Stores certificates used to verify that clients are allowed to access packet data. `stenographer` writes certificates for client and server to this directory, and the clients `stenoread`/`stenocurl` read these certs and use them to make requests.
* Packet directories: These are chosen by the installer. See the **Configuration File** section for more details.

Configuration File
------------------

The `/etc/stenographer/config` file tells Stenographer what packets to read, where to write them, how to serve them, etc. It also tells the clients where the Stenographer server is running and how to query it.
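Since the config file is plain JSON, a quick sanity check before deploying can be scripted easily. This is a hedged sketch: the field names come from this document, but the validation rules are illustrative, not Stenographer's own:

```python
import json

# Top-level fields used in this document's example config.
REQUIRED = ('Threads', 'StenotypePath', 'Interface', 'Port', 'CertPath')

def check_config(text):
    """Parse a Stenographer config and flag missing top-level fields.

    DiskFreePercentage is optional and documented to default to 10%.
    """
    conf = json.loads(text)
    missing = [field for field in REQUIRED if field not in conf]
    free = [t.get('DiskFreePercentage', 10) for t in conf.get('Threads', [])]
    return conf, missing, free
```

Feeding it the example config from this section should report no missing fields, and a free-disk percentage of 10 for any thread that omits `DiskFreePercentage`.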
Here's an example config (note: it's JSON):

    { "Threads": [ { "PacketsDirectory": "/disk1/stenopkt", "IndexDirectory": "/disk3/stenoidx/disk1"}
                 , { "PacketsDirectory": "/disk2/stenopkt", "IndexDirectory": "/disk3/stenoidx/disk2", "DiskFreePercentage": 25}
                 ]
    , "StenotypePath": "/usr/local/bin/stenotype"
    , "Interface": "em1"
    , "Port": 1234
    , "Flags": []
    , "CertPath": "/etc/stenographer/certs"
    }

Let's look at each part of this in detail:

* `StenotypePath`: Where `stenographer` can find the `stenotype` binary, which it runs as a subprocess
* `Interface`: Network interface to read packets from
* `Port`: Port `stenographer` will bind to in order to serve `stenoread` requests
* `CertPath`: Where `stenographer` will write certificates for client verification, and where the clients will read certificates when issuing queries

### Threads ###

The `Threads` section is one of the most important. It tells `stenotype`, the packet capturing subprocess, a number of things: how many threads to read packets with, where to store those packets, and how to clean them up. For each packet reading thread you'd like to run (i.e., for each core you'd like to use), you must specify:

* `PacketsDirectory`: Where to write packet files. We recommend mounting a separate disk for each thread... we've found that, at least for spinning disks, a single core can easily fill a disk's entire write throughput with room to spare.
* `IndexDirectory`: Where to write index files. We've had good luck using a single separate disk to write all index files, writing each thread's index to a separate subdirectory. This directory gets FAR fewer writes, and they're FAR smaller. We've found that even with up to 8 threads, all 8 index directories take up less than 20% of the space of a single thread's packets.
* `DiskFreePercentage`: The amount of space to keep free in the *packets* directory.
  `stenographer` will delete files in this thread's packets directory when free disk space decreases below this percentage. Note that we don't currently do any automated cleanup of the index directory: when a packet file is cleaned up, its index file is cleaned up with it, and because index files take so little space, we haven't ever needed to clean them up directly. Note that `DiskFreePercentage` is optional... it defaults to 10%.
* `MaxDirectoryFiles`: The maximum number of packet/index files to create before cleaning old ones up. Defaults to 30K files, to avoid issues with ext3's 32K files-per-directory maximum. For ext4 you should be able to go higher without issue. Note that since we create at least one file every minute, this default imposes a maximum limit of 8 1/3 days before we drop old packets.

### Flags ###

The `Flags` section allows you to specify flags to pass to the `stenotype` binary. Here are some flags which may prove particularly useful:

* `-v`: Add verbosity to logging. Logs by default are written to syslog and are relatively quiet. Adding one `-v` will have stenotype write per-thread capture statistics every minute or every 100MB of packets, whichever comes first. Adding more `-v` flags will provide you with reams of debugging information.
* `--blocks=NUM`: The number of 1MB packet blocks used by AF_PACKET to store packets in memory, *per thread*. This flag basically allows you to control how much RAM the `stenotype` binary uses: `blocks * threads * 1MB`. More blocks allow a thread to handle traffic spikes: if you have 2048 blocks (the default), then a thread can hold 2GB of traffic in memory while waiting for it to hit disk. If you have slow links and you want to decrease memory usage, you can probably decrease this a LOT. :)
* `--fanout_type=NUM`: This sets the AF_PACKET fanout type to the passed-in value. See the AF_PACKET documentation for details on the options here. The default should probably be fine.
* `--filter=HEX`: Allows users to specify a BPF filter for packet capture... only packets which match this filter will be written by `stenotype`. This is NOT a human-readable BPF filter... it's a hex-encoded compiled filter. Use the supplied `compile_bpf.sh` script to generate this encoding from a human-readable filter.
* `--seccomp=none|trace|kill`: We use seccomp to sandbox stenotype, but we've found that this can be fragile as we switch between different machine configurations. Some VMs appear to freeze while trying to set up seccomp sandboxes: for those environments, you can pass in `--seccomp=none` (note that this will turn off some sandboxing). If you're trying to debug a `stenotype` failure you think is caused by overzealous sandboxing, you can pass in `--seccomp=trace`, then run stenotype with `strace` to figure out why things are misbehaving.
* `--preallocate_file_mb=NUM`: Certain file systems handle writes faster if the file has already been allocated to its eventual size. If you set this flag to `4096`, then stenotype will preallocate each new packet file to this size while opening it. The file will be truncated to its actual size when closed. This should not be necessary unless you're really trying to eke out some extra speed on a file system that supports extents. NOTE: If you are using xfs on a 4.1+ kernel, setting `--preallocate_file_mb` becomes crucial to performance. This is because when appending to the end of a file (i.e., EOF updates), the kernel will serialize all operations. Please refer to commit (b9d5984 xfs: DIO write completion size updates race).

There are a number of other flags that `stenotype` supports, but most of them are for debugging purposes.

==> stenographer-1.0.1/LICENSE <==

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.
"License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. 
"Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the 
Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. 
Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. 
To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. stenographer-1.0.1/README.md000066400000000000000000000115021372346644400155050ustar00rootroot00000000000000Stenographer ============ Overview -------- Stenographer is a full-packet-capture utility for buffering packets to disk for intrusion detection and incident response purposes. It provides a high-performance implementation of NIC-to-disk packet writing, handles deleting those files as disk fills up, and provides methods for reading back specific sets of packets quickly and easily. It is designed to: * Write packets to disk, very quickly (~10Gbps on multi-core, multi-disk machines) * Store as much history as it can (managing disk usage, storing longer durations when traffic slows, then deleting the oldest packets when it hits disk limits) * Read a very small percentage (<1%) of packets from disk based on analyst needs It is NOT designed for: * Complex packet processing (TCP stream reassembly, etc) * It’s fast because it doesn’t do this.  
Even with the very minimal,
    single-pass processing of packets we do, processing ~1Gbps for indexing
    alone can take >75% of a single core.
  * Processing the data by reading it back from disk also doesn’t work: see
    next bullet point.
* Reading back large amounts of packets (> 1% of packets written)
  * The key concept here is that disk reads compete with disk writes… you can
    write at 90% of disk speed, but that only gives you 10% of your disk’s
    time for reading. Also, we’re writing highly sequential data, which disks
    are very good at doing quickly, and generally reading back sparse data
    with lots of seeks, which disks do slowly.

For further reading, check out **[DESIGN.md](DESIGN.md)** for a discussion of
stenographer's design, or read **[INSTALL.md](INSTALL.md)** for how to install
stenographer on a machine.

Querying
--------

### Query Language ###

A user requests packets from stenographer by specifying them with a very
simple query language. This language is a simple subset of BPF, and includes
the primitives:

    host 8.8.8.8                    # Single IP address (hostnames not allowed)
    net 1.0.0.0/8                   # Network with CIDR
    net 1.0.0.0 mask 255.255.255.0  # Network with mask
    port 80                         # Port number (UDP or TCP)
    ip proto 6                      # IP protocol number 6
    icmp                            # equivalent to 'ip proto 1'
    tcp                             # equivalent to 'ip proto 6'
    udp                             # equivalent to 'ip proto 17'

    # Stenographer-specific time additions:
    before 2012-11-03T11:05:00Z      # Packets before a specific time (UTC)
    after 2012-11-03T11:05:00-07:00  # Packets after a specific time (with TZ)
    before 45m ago                   # Packets before a relative time
    after 3h ago                     # Packets after a relative time

**NOTE**: Relative times must be measured in integer values of hours or
minutes as demonstrated above.

Primitives can be combined with and/&& and with or/||, which have equal
precedence and evaluate left-to-right. Parens can also be used to group.
    (udp and port 514) or (tcp and port 8080)

### Stenoread CLI ###

The *stenoread* command line script automates pulling packets from
Stenographer and presenting them in a usable format to analysts. It requests
raw packets from stenographer, then runs them through *tcpdump* to provide a
more full-featured formatting/filtering experience. The first argument to
*stenoread* is a stenographer query (see 'Query Language' above). All other
arguments are passed to *tcpdump*. For example:

    # Request all packets from IP 1.2.3.4 port 6543, then do extra filtering by
    # TCP flag, which typical stenographer does not support.
    $ stenoread 'host 1.2.3.4 and port 6543' 'tcp[tcpflags] & tcp-push != 0'

    # Request packets on port 8765, disabling IP resolution (-n) and showing
    # link-level headers (-e) when printing them out.
    $ stenoread 'port 8765' -n -e

    # Request packets for any IPs in the range 1.1.1.0-1.1.1.255, writing them
    # out to a local PCAP file so they can be opened in Wireshark.
    $ stenoread 'net 1.1.1.0/24' -w /tmp/output_for_wireshark.pcap

Downloading
-----------

To download the source code, install Go locally, then run:

    $ go get github.com/google/stenographer

Go will handle downloading and installing all Go libraries that
`stenographer` depends on. To build `stenotype`, go into the `stenotype`
directory and run `make`. You may need to install the following Ubuntu
packages (or their equivalents on other Linux distros):

* libaio-dev
* libleveldb-dev
* libsnappy-dev
* g++
* libcap2-bin
* libseccomp-dev

Obligatory Fine Print
---------------------

This is not an official Google product (experimental or otherwise), it is
just code that happens to be owned by Google. This code is not intended (or
used) to watch Google's users. Its purpose is to increase security on our
networks by augmenting our internal monitoring capabilities.
stenographer-1.0.1/base/000077500000000000000000000000001372346644400151415ustar00rootroot00000000000000stenographer-1.0.1/base/base.go000066400000000000000000000265301372346644400164100ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. // Package base provides common utilities for other stenographer libraries. package base import ( "container/heap" "flag" "fmt" "io" "log" "net/http" "sort" "strconv" "sync" "syscall" "time" "github.com/google/gopacket" "github.com/google/gopacket/layers" "github.com/google/gopacket/pcapgo" "golang.org/x/net/context" ) var VerboseLogging = flag.Int("v", -1, "log many verbose logs") // V provides verbose logging which can be turned on/off with the -v flag. func V(level int, fmt string, args ...interface{}) { if *VerboseLogging >= level { log.Printf(fmt, args...) } } // Packet is a single packet with its metadata. type Packet struct { Data []byte // The actual bytes that make up the packet gopacket.CaptureInfo // Metadata about when/how the packet was captured } // PacketChan provides an async method for passing multiple ordered packets // between goroutines. type PacketChan struct { mu sync.Mutex c chan *Packet // C can be used to send packets on this channel in a select. Do NOT // call 'close' on it... instead call the Close function. C chan<- *Packet err error done chan struct{} } // Receive provides the channel from which to read packets. 
It always
// returns the same channel.
func (p *PacketChan) Receive() <-chan *Packet { return p.c }

// Send sends a single packet on the channel to the receiver.
func (p *PacketChan) Send(pkt *Packet) { p.c <- pkt }

// Close closes the sending channel and sets the PacketChan's error based
// on its input.
func (p *PacketChan) Close(err error) {
	p.mu.Lock()
	p.err = err
	p.mu.Unlock()
	close(p.c)
	close(p.done)
}

// Done returns a channel that is closed when this packet channel is complete.
func (p *PacketChan) Done() <-chan struct{} { return p.done }

// NewPacketChan returns a new PacketChan channel for passing packets around.
func NewPacketChan(buffer int) *PacketChan {
	pc := &PacketChan{
		c:    make(chan *Packet, buffer),
		done: make(chan struct{}),
	}
	pc.C = pc.c
	return pc
}

// Discard discards all remaining packets on the receiving end. If you stop
// using the channel before reading all packets, you must call this function.
// It's a good idea to defer this regardless.
func (p *PacketChan) Discard() {
	go func() {
		discarded := 0
		for range p.c {
			discarded++
		}
		if discarded > 0 {
			V(2, "discarded %v", discarded)
		}
	}()
}

// Err gets the current error for the channel, if any exists. This may be
// called during Next(), but if an error occurs it may only be set after Next()
// returns false the first time.
func (p *PacketChan) Err() error {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.err
}

// indexedPacket is used internally by MergePacketChans.
type indexedPacket struct {
	*Packet
	i int
}

// packetHeap is used internally by MergePacketChans.
type packetHeap []indexedPacket func (p packetHeap) Len() int { return len(p) } func (p packetHeap) Swap(i, j int) { p[i], p[j] = p[j], p[i] } func (p packetHeap) Less(i, j int) bool { return p[i].Timestamp.Before(p[j].Timestamp) } func (p *packetHeap) Push(x interface{}) { *p = append(*p, x.(indexedPacket)) } func (p *packetHeap) Pop() (x interface{}) { index := len(*p) - 1 *p, x = (*p)[:index], (*p)[index] return } // ConcatPacketChans concatenates packet chans in order. func ConcatPacketChans(ctx context.Context, in <-chan *PacketChan) *PacketChan { out := NewPacketChan(100) go func() { for c := range in { c := c defer c.Discard() L: for c.Err() == nil { select { case pkt := <-c.Receive(): if pkt == nil { break L } out.Send(pkt) case <-ctx.Done(): out.Close(ctx.Err()) return } } if err := c.Err(); err != nil { out.Close(err) return } } out.Close(nil) }() return out } // MergePacketChans merges an incoming set of packet chans, each sorted by // time, returning a new single packet chan that's also sorted by time. func MergePacketChans(ctx context.Context, in []*PacketChan) *PacketChan { out := NewPacketChan(100) go func() { count := 0 defer func() { V(1, "merged %d streams for %d total packets", len(in), count) }() var h packetHeap for i := range in { defer in[i].Discard() } for i, c := range in { select { case pkt := <-c.Receive(): if pkt != nil { heap.Push(&h, indexedPacket{Packet: pkt, i: i}) } if err := c.Err(); err != nil { out.Close(err) return } case <-ctx.Done(): out.Close(ctx.Err()) return } } for h.Len() > 0 && !ContextDone(ctx) { p := heap.Pop(&h).(indexedPacket) count++ if pkt := <-in[p.i].Receive(); pkt != nil { heap.Push(&h, indexedPacket{Packet: pkt, i: p.i}) } out.c <- p.Packet if err := in[p.i].Err(); err != nil { out.Close(err) return } } out.Close(ctx.Err()) }() return out } // Positions detail the offsets of packets within a blockfile. 
type Positions []int64 var ( AllPositions = Positions{-1} NoPositions = Positions{} ) func (p Positions) IsAllPositions() bool { return len(p) == 1 && p[0] == -1 } func (a Positions) Less(i, j int) bool { return a[i] < a[j] } func (a Positions) Swap(i, j int) { a[i], a[j] = a[j], a[i] } func (a Positions) Len() int { return len(a) } func (a Positions) Sort() { sort.Sort(a) } // Union returns the union of a and b. a and b must be sorted in advance. // Returned slice will be sorted. // a or b may be returned by Union, but neither a nor b will be modified. func (a Positions) Union(b Positions) (out Positions) { switch { case a.IsAllPositions(): return a case b.IsAllPositions(): return b case len(a) == 0: return b case len(b) == 0: return a } out = make(Positions, 0, len(a)+len(b)/2) ib := 0 for _, pos := range a { for ib < len(b) && b[ib] < pos { out = append(out, b[ib]) ib++ } if ib < len(b) && b[ib] == pos { ib++ } out = append(out, pos) } out = append(out, b[ib:]...) return out } // Intersect returns the intersection of a and b. a and b must be sorted in // advance. Returned slice will be sorted. // a or b may be returned by Intersect, but neither a nor b will be modified. func (a Positions) Intersect(b Positions) (out Positions) { switch { case a.IsAllPositions(): return b case b.IsAllPositions(): return a case len(a) == 0: return a case len(b) == 0: return b } out = make(Positions, 0, len(a)/2) ib := 0 for _, pos := range a { for ib < len(b) && b[ib] < pos { ib++ } if ib < len(b) && b[ib] == pos { out = append(out, pos) ib++ } } return out } func PathDiskFreePercentage(path string) (int, error) { var stat syscall.Statfs_t if err := syscall.Statfs(path, &stat); err != nil { return 0, err } return int(100 * stat.Bavail / stat.Blocks), nil } // snapLen is the max packet size we'll return in pcap files to users. const snapLen = 65536 // PacketsToFile writes all packets from 'in' to 'out', writing out all packets // in a valid PCAP file format. 
func PacketsToFile(in *PacketChan, out io.Writer, limit Limit) error { w := pcapgo.NewWriter(out) w.WriteFileHeader(snapLen, layers.LinkTypeEthernet) count := 0 defer in.Discard() defer func() { V(1, "wrote %d packets of %d input packets", count, len(in.C)) }() const pcapHeaderSize = 16 // same for file header and per-packet header // If someone REALLY wants an empty pcap file, we'll give it to them :P if limit.ShouldStopAfter(Limit{Bytes: pcapHeaderSize}) { return nil } for p := range in.Receive() { if len(p.Data) > snapLen { p.Data = p.Data[:snapLen] } if err := w.WritePacket(p.CaptureInfo, p.Data); err != nil { // This can happen if our pipe is broken, and we don't want to blow stack // traces all over our users when that happens, so Error/Exit instead of // Fatal. return fmt.Errorf("error writing packet: %v", err) } count++ if limit.ShouldStopAfter(Limit{Bytes: int64(len(p.Data) + pcapHeaderSize), Packets: 1}) { return nil } } return in.Err() } // ContextDone returns true if a context is complete. func ContextDone(ctx context.Context) bool { // There's two ways we could do this: by checking ctx.Done or by // seeing if ctx.Err != nil. The latter, though, uses a single // exclusive mutex, so when the context is shared by a ton of // goroutines, it can actually block things quite a bit. Checking // ctx.Done is much more scalable across multiple goroutines. select { case <-ctx.Done(): return true default: return false } } // Watchdog returns a time.Timer which log.Fatals if it goes off. // The creator must call Stop before that time (to never die) // or Reset (to postpone the inevitable). // // Usage: // func couldGetStuck() { // defer base.Watchdog(time.Minute * 5, "my description").Stop() // ... do stuff ... 
// }
//
// Or:
//  func couldGetStuckOnManyThings(things []thing) {
//    fido := base.Watchdog(time.Second * 15)
//    defer fido.Stop()
//    initialize()  // can take up to 15 secs
//    for _, thing := range things {
//      fido.Reset(time.Second * 5)
//      process(thing)  // can take up to 5 seconds each
//    }
//  }
func Watchdog(d time.Duration, msg string) *time.Timer {
	return time.AfterFunc(d, func() {
		log.Fatalf("watchdog failed: %v", msg)
	})
}

// Limit is the amount of data we want to return, or the amount taken by a
// single upload.
type Limit struct {
	Bytes, Packets int64
}

func dec(a *int64, b int64) bool {
	if *a != 0 && b != 0 {
		*a -= b
		return *a <= 0
	}
	return false
}

// ShouldStopAfter returns true if output should stop after the next update,
// where the next update is of size 'b'.
func (a *Limit) ShouldStopAfter(b Limit) bool {
	bytes := dec(&a.Bytes, b.Bytes)
	packets := dec(&a.Packets, b.Packets)
	return bytes || packets
}

// LimitFromHeaders returns a Limit based on HTTP headers.
func LimitFromHeaders(h http.Header) (a Limit, err error) {
	if limitStr := h.Get("Steno-Limit-Bytes"); limitStr != "" {
		if a.Bytes, err = strconv.ParseInt(limitStr, 0, 64); err != nil {
			return
		}
	}
	if limitStr := h.Get("Steno-Limit-Packets"); limitStr != "" {
		if a.Packets, err = strconv.ParseInt(limitStr, 0, 64); err != nil {
			return
		}
	}
	return
}

// Context wraps a context.Context with its cancel function.
type Context interface {
	context.Context
	Cancel()
}

type contextWithCancel struct {
	context.Context
	cancel context.CancelFunc
}

// Cancel cancels this context.
func (c *contextWithCancel) Cancel() { c.cancel() }

// NewContext wraps a context.Context in our type of Context.
// Timeout of zero means never time out.
// Cancel function should be called when operation is completed.
func NewContext(timeout time.Duration) Context { var c context.Context var cancel context.CancelFunc if timeout == 0 { c, cancel = context.WithCancel(context.Background()) } else { c, cancel = context.WithTimeout(context.Background(), timeout) } return &contextWithCancel{c, cancel} } stenographer-1.0.1/base/base_test.go000066400000000000000000000104001372346644400174340ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. 
package base import ( "bytes" "reflect" "testing" "time" "github.com/google/gopacket" "golang.org/x/net/context" ) var ctx = context.Background() func testPacketData(t *testing.T) []*Packet { var ci = []gopacket.CaptureInfo{ {Timestamp: time.Unix(123, 123), CaptureLength: 3, Length: 3}, {Timestamp: time.Unix(456, 456), CaptureLength: 3, Length: 3}, {Timestamp: time.Unix(789, 789), CaptureLength: 3, Length: 3}, } out := []*Packet{&Packet{[]byte{1, 2, 3}, ci[0]}, &Packet{[]byte{4, 5, 6}, ci[1]}, &Packet{[]byte{7, 8, 9}, ci[2]}} return out } func comparePacketChans(t *testing.T, want, got *PacketChan) { for { p1, ok1 := <-want.Receive() p2, ok2 := <-got.Receive() if !ok1 || !ok2 { if ok1 != ok2 { t.Errorf("missing packet:\nwant:%v\ngot:%v\n", p1, p2) } break } if p1 != p2 { t.Errorf("wrong packet\nwant:%v\ngot:%v\n", p1, p2) } } } func TestConcatPacketChans(t *testing.T) { packets := testPacketData(t) inputs := make(chan *PacketChan, 2) one := NewPacketChan(100) two := NewPacketChan(100) one.Send(packets[0]) two.Send(packets[1]) one.Close(nil) two.Close(nil) inputs <- one inputs <- two close(inputs) got := ConcatPacketChans(ctx, inputs) want := NewPacketChan(3) want.Send(packets[0]) want.Send(packets[1]) want.Close(nil) comparePacketChans(t, want, got) } func TestMergePacketChans(t *testing.T) { packets := testPacketData(t) one := NewPacketChan(100) two := NewPacketChan(100) inputs := []*PacketChan{one, two} one.Send(packets[1]) two.Send(packets[0]) one.Close(nil) two.Close(nil) got := MergePacketChans(ctx, inputs) want := NewPacketChan(100) want.Send(packets[0]) want.Send(packets[1]) want.Close(nil) comparePacketChans(t, want, got) } func TestUnion(t *testing.T) { for _, test := range []struct { a, b, want Positions }{ { Positions{1, 2, 3}, Positions{2, 3, 4}, Positions{1, 2, 3, 4}, }, { Positions{1, 2}, Positions{3, 4}, Positions{1, 2, 3, 4}, }, { Positions{3, 4}, Positions{1, 2}, Positions{1, 2, 3, 4}, }, } { got := test.a.Union(test.b) if !reflect.DeepEqual(got, 
test.want) { t.Errorf("nope:\n a: %v\n b: %v\n got: %v\nwant: %v", test.a, test.b, got, test.want) } } } func TestIntersect(t *testing.T) { for _, test := range []struct { a, b, want Positions }{ { Positions{1, 2, 3, 4}, Positions{0, 2, 4, 5}, Positions{2, 4}, }, { Positions{1, 2, 3}, Positions{2, 3, 4}, Positions{2, 3}, }, { Positions{1, 2}, Positions{3, 4}, Positions{}, }, { Positions{3, 4}, Positions{1, 2}, Positions{}, }, } { got := test.a.Intersect(test.b) if !reflect.DeepEqual(got, test.want) { t.Errorf("nope:\n a: %v\n b: %v\n got: %v\nwant: %v", test.a, test.b, got, test.want) } } } func TestPacketsToFile(t *testing.T) { var out bytes.Buffer packets := testPacketData(t) pc := NewPacketChan(100) pc.Send(packets[0]) pc.Close(nil) want := []byte{ 0xd4, 0xc3, 0xb2, 0xa1, 0x02, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, 0x00, 0x7b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x01, 0x02, 0x03, } PacketsToFile(pc, &out, Limit{}) if got := out.Bytes(); !bytes.Equal(want, got) { t.Errorf("wrong packets:\nwant: %+v\ngot: %+v", want, got) } } func TestContextDone(t *testing.T) { ctx := NewContext(0) if ContextDone(ctx) { t.Fatal("shouldn't be done yet") } ctx.Cancel() if !ContextDone(ctx) { t.Fatal("should be done now") } ctx = NewContext(time.Microsecond) time.Sleep(time.Millisecond) if !ContextDone(ctx) { t.Fatal("should have timed out by now") } } stenographer-1.0.1/blockfile/000077500000000000000000000000001372346644400161615ustar00rootroot00000000000000stenographer-1.0.1/blockfile/blockfile.go000066400000000000000000000207441372346644400204510ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. 
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// Package blockfile provides methods for reading packets from blockfiles
// generated by stenotype.
package blockfile

import (
	"errors"
	"fmt"
	"io"
	"sync"
	"time"
	"unsafe"

	"github.com/google/gopacket"
	"github.com/google/stenographer/base"
	"github.com/google/stenographer/filecache"
	"github.com/google/stenographer/indexfile"
	"github.com/google/stenographer/query"
	"github.com/google/stenographer/stats"
	"golang.org/x/net/context"
)

// #include <linux/if_packet.h>
import "C"

var (
	v = base.V // Verbose logging

	packetReadNanos  = stats.S.Get("packet_read_nanos")
	packetScanNanos  = stats.S.Get("packet_scan_nanos")
	packetsRead      = stats.S.Get("packets_read")
	packetsScanned   = stats.S.Get("packets_scanned")
	packetBlocksRead = stats.S.Get("packets_blocks_read")
)

// BlockFile provides an interface to a single stenotype file on disk and its
// associated index.
type BlockFile struct {
	name string
	f    *filecache.CachedFile
	i    *indexfile.IndexFile
	mu   sync.RWMutex // Stops Close() from invalidating a file before a current query is done with it.
	done chan struct{}
	size int64
}

// NewBlockFile opens up a named block file (and its index), returning a handle
// which can be used to look up packets.
func NewBlockFile(filename string, fc *filecache.Cache) (*BlockFile, error) { v(1, "Blockfile opening: %q", filename) i, err := indexfile.NewIndexFile(indexfile.IndexPathFromBlockfilePath(filename), fc) if err != nil { return nil, fmt.Errorf("could not open index for %q: %v", filename, err) } f := fc.Open(filename) s, err := f.Stat() if err != nil { f.Close() return nil, fmt.Errorf("could not stat file %q: %v", filename, err) } return &BlockFile{ f: f, i: i, name: filename, done: make(chan struct{}), size: s.Size(), }, nil } // Name returns the name of the file underlying this blockfile. func (b *BlockFile) Name() string { return b.name } // Size returns the size of the blockfile in bytes. func (b *BlockFile) Size() int64 { return b.size } // readPacket reads a single packet from the file at the given position. // It updates the passed in CaptureInfo with information on the packet. func (b *BlockFile) readPacket(pos int64, ci *gopacket.CaptureInfo) ([]byte, error) { // 28 bytes actually isn't the entire packet header, but it's all the fields // that we care about. packetsRead.Increment() defer packetReadNanos.NanoTimer()() var dataBuf [28]byte _, err := b.f.ReadAt(dataBuf[:], pos) if err != nil { return nil, err } pkt := (*C.struct_tpacket3_hdr)(unsafe.Pointer(&dataBuf[0])) *ci = gopacket.CaptureInfo{ Timestamp: time.Unix(int64(pkt.tp_sec), int64(pkt.tp_nsec)), Length: int(pkt.tp_len), CaptureLength: int(pkt.tp_snaplen), } out := make([]byte, ci.CaptureLength) pos += int64(pkt.tp_mac) _, err = b.f.ReadAt(out, pos) return out, err } // Close cleans up this blockfile. func (b *BlockFile) Close() (err error) { v(2, "Blockfile closing: %q", b.name) close(b.done) b.mu.Lock() defer b.mu.Unlock() v(3, "Blockfile closing file descriptors: %q", b.name) if e := b.i.Close(); e != nil { err = e } if e := b.f.Close(); e != nil { err = e } b.i, b.f = nil, nil return } // allPacketsIter implements Iter. 
type allPacketsIter struct { *BlockFile blockData []byte block *C.struct_tpacket_hdr_v1 pkt *C.struct_tpacket3_hdr blockPacketsRead int blockOffset int64 packetOffset int // offset of packet in block err error done bool } func (a *allPacketsIter) Next() bool { defer packetScanNanos.NanoTimer()() if a.err != nil || a.done { return false } for a.block == nil || a.blockPacketsRead == int(a.block.num_pkts) { packetBlocksRead.Increment() a.blockData = make([]byte, 1<<20) _, err := a.f.ReadAt(a.blockData[:], a.blockOffset) if err == io.EOF { a.done = true return false } else if err != nil { a.err = fmt.Errorf("could not read block at %v: %v", a.blockOffset, err) return false } baseHdr := (*C.struct_tpacket_block_desc)(unsafe.Pointer(&a.blockData[0])) a.block = (*C.struct_tpacket_hdr_v1)(unsafe.Pointer(&baseHdr.hdr[0])) a.blockOffset += 1 << 20 a.blockPacketsRead = 0 a.pkt = nil } a.blockPacketsRead++ if a.pkt == nil { a.packetOffset = int(a.block.offset_to_first_pkt) } else if a.pkt.tp_next_offset != 0 { a.packetOffset += int(a.pkt.tp_next_offset) } else { a.err = errors.New("block format currently not supported") return false } a.pkt = (*C.struct_tpacket3_hdr)(unsafe.Pointer(&a.blockData[a.packetOffset])) packetsScanned.Increment() return true } func (a *allPacketsIter) Packet() *base.Packet { start := a.packetOffset + int(a.pkt.tp_mac) buf := a.blockData[start : start+int(a.pkt.tp_snaplen)] p := &base.Packet{Data: buf} p.CaptureInfo.Timestamp = time.Unix(int64(a.pkt.tp_sec), int64(a.pkt.tp_nsec)) p.CaptureInfo.Length = int(a.pkt.tp_len) p.CaptureInfo.CaptureLength = int(a.pkt.tp_snaplen) return p } func (a *allPacketsIter) Err() error { return a.err } // AllPackets returns a packet channel to which all packets in the blockfile are // sent. 
func (b *BlockFile) AllPackets() *base.PacketChan { b.mu.RLock() c := base.NewPacketChan(100) go func() { defer b.mu.RUnlock() pkts := &allPacketsIter{BlockFile: b} for pkts.Next() { c.Send(pkts.Packet()) } c.Close(pkts.Err()) }() return c } // Positions returns the positions in the blockfile of all packets matched by // the passed-in query. func (b *BlockFile) Positions(ctx context.Context, q query.Query) (base.Positions, error) { b.mu.RLock() defer b.mu.RUnlock() return b.positionsLocked(ctx, q) } // positionsLocked returns the positions in the blockfile of all packets matched by // the passed-in query. b.mu must be locked. func (b *BlockFile) positionsLocked(ctx context.Context, q query.Query) (base.Positions, error) { if b.i == nil || b.f == nil { // If we're closed, just return nothing. return nil, nil } return q.LookupIn(ctx, b.i) } // Lookup returns all packets in the blockfile matched by the passed-in query. func (b *BlockFile) Lookup(ctx context.Context, q query.Query, out *base.PacketChan) { b.mu.RLock() defer b.mu.RUnlock() var ci gopacket.CaptureInfo v(2, "Blockfile %q looking up query %q", b.name, q.String()) start := time.Now() positions, err := b.positionsLocked(ctx, q) if err != nil { out.Close(fmt.Errorf("index lookup failure: %v", err)) return } if positions.IsAllPositions() { v(2, "Blockfile %q reading all packets", b.name) iter := &allPacketsIter{BlockFile: b} all_packets_loop: for iter.Next() { select { case <-ctx.Done(): v(2, "Blockfile %q canceling packet read", b.name) break all_packets_loop case <-b.done: v(2, "Blockfile %q closing, breaking out of query", b.name) break all_packets_loop case out.C <- iter.Packet(): } } if iter.Err() != nil { out.Close(fmt.Errorf("error reading all packets from %q: %v", b.name, iter.Err())) return } } else { v(2, "Blockfile %q reading %v packets", b.name, len(positions)) query_packets_loop: for _, pos := range positions { buffer, err := b.readPacket(pos, &ci) if err != nil { v(2, "Blockfile %q error reading 
packet: %v", b.name, err) out.Close(fmt.Errorf("error reading packets from %q @ %v: %v", b.name, pos, err)) return } select { case <-ctx.Done(): v(2, "Blockfile %q canceling packet read", b.name) break query_packets_loop case <-b.done: v(2, "Blockfile %q closing, breaking out of query", b.name) break query_packets_loop case out.C <- &base.Packet{Data: buffer, CaptureInfo: ci}: } } } v(2, "Blockfile %q finished reading all packets in %v", b.name, time.Since(start)) out.Close(ctx.Err()) } // DumpIndex dumps out a "human-readable" debug version of the blockfile's index // to the given writer. func (b *BlockFile) DumpIndex(out io.Writer, start, finish []byte) { b.mu.RLock() defer b.mu.RUnlock() b.i.Dump(out, start, finish) } stenographer-1.0.1/blockfile/blockfile_test.go000066400000000000000000000032001372346644400214740ustar00rootroot00000000000000// Copyright 2015 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. 
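The cgo-based header parse in readPacket can be mirrored in pure Go. The sketch below assumes the struct tpacket3_hdr field layout from <linux/if_packet.h> (tp_next_offset, tp_sec, tp_nsec, tp_snaplen, tp_len, tp_status as 32-bit fields, then the 16-bit tp_mac offset) and a little-endian host, which is why 28 bytes covers everything readPacket needs; the names pktHeader and parseHeader are illustrative, not part of the blockfile package.

```go
// Pure-Go sketch of the header parse that readPacket does via cgo.
// Offsets are an assumption based on the tpacket3_hdr layout; on a
// different kernel ABI they could differ.
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

type pktHeader struct {
	NextOffset uint32 // tp_next_offset: offset to the next packet in the block
	Sec, Nsec  uint32 // tp_sec / tp_nsec: capture timestamp
	Snaplen    uint32 // tp_snaplen: bytes actually captured
	Len        uint32 // tp_len: original wire length
	Mac        uint16 // tp_mac: offset from header start to packet data
}

func parseHeader(b []byte) pktHeader {
	le := binary.LittleEndian // stenotype writes host order; little-endian assumed
	return pktHeader{
		NextOffset: le.Uint32(b[0:4]),
		Sec:        le.Uint32(b[4:8]),
		Nsec:       le.Uint32(b[8:12]),
		Snaplen:    le.Uint32(b[12:16]),
		Len:        le.Uint32(b[16:20]),
		// b[20:24] is tp_status, unused here.
		Mac: le.Uint16(b[24:26]),
	}
}

func main() {
	var raw [28]byte
	binary.LittleEndian.PutUint32(raw[4:], 1600000000) // tp_sec
	binary.LittleEndian.PutUint32(raw[12:], 96)        // tp_snaplen
	binary.LittleEndian.PutUint16(raw[24:], 28)        // tp_mac
	h := parseHeader(raw[:])
	fmt.Println(time.Unix(int64(h.Sec), int64(h.Nsec)).UTC(), h.Snaplen, h.Mac)
}
```

Packet data then starts at pos+Mac and runs for Snaplen bytes, exactly as readPacket computes.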
package blockfile import ( "reflect" "testing" "golang.org/x/net/context" "github.com/google/stenographer/base" "github.com/google/stenographer/filecache" "github.com/google/stenographer/query" ) var ctx = context.Background() const ( filename = "../testdata/PKT0/dhcp" ) func testBlockFile(t *testing.T, filename string) *BlockFile { blk, err := NewBlockFile(filename, filecache.NewCache(10)) if err != nil { t.Fatal(err) } return blk } func TestPositions(t *testing.T) { blk := testBlockFile(t, filename) defer blk.Close() for _, test := range []struct { // test struct query string want base.Positions }{ // tests {"port 67", base.Positions{1048624, 1049024, 1049448, 1049848}}, {"port 69", nil}, } { // code to run single test if q, err := query.NewQuery(test.query); err != nil { t.Fatal(err) } else if got, err := blk.Positions(ctx, q); err != nil { t.Fatal(err) } else if !reflect.DeepEqual(got, test.want) { t.Errorf("wrong packet positions.\nwant: %v\n got: %v\n", test.want, got) } } } stenographer-1.0.1/certs/000077500000000000000000000000001372346644400153475ustar00rootroot00000000000000stenographer-1.0.1/certs/certs.go000066400000000000000000000031541372346644400170210ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. // Package certs provides helper libraries for generating self-signed // certificates, which we use locally for authorizing users to read // packet data. 
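The positions checked in the test above are plain byte offsets into the blockfile, which is what lets compound queries combine per-predicate index lookups with sorted-list merges. A hypothetical sketch of the AND-style merge; the real logic lives in the base and query packages, not in this helper:

```go
// Sorted-offset intersection, the kind of merge an AND query over two
// index lookups needs. The intersect helper is illustrative only.
package main

import "fmt"

func intersect(a, b []int64) []int64 {
	var out []int64
	for i, j := 0, 0; i < len(a) && j < len(b); {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default: // offset present in both lists
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

func main() {
	// Offsets shaped like the ones in TestPositions.
	fmt.Println(intersect(
		[]int64{1048624, 1049024, 1049448},
		[]int64{1049024, 1049448, 1049848}))
}
```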
package certs import ( "crypto/tls" "crypto/x509" "encoding/pem" "fmt" "io/ioutil" ) // ClientVerifyingTLSConfig returns a TLS config which verifies that clients // have a certificate signed by the CA certificate in the certFile. func ClientVerifyingTLSConfig(certFile string) (*tls.Config, error) { // Test cert file var cert *x509.Certificate if certBytes, err := ioutil.ReadFile(certFile); err != nil { return nil, fmt.Errorf("could not read cert file: %v", err) } else if block, _ := pem.Decode(certBytes); block == nil { return nil, fmt.Errorf("could not get cert pem block from %q", certFile) } else if cert, err = x509.ParseCertificate(block.Bytes); err != nil { return nil, fmt.Errorf("could not parse cert: %v", err) } cas := x509.NewCertPool() cas.AddCert(cert) return &tls.Config{ ClientAuth: tls.RequireAndVerifyClientCert, ClientCAs: cas, }, nil } stenographer-1.0.1/config/000077500000000000000000000000001372346644400154745ustar00rootroot00000000000000stenographer-1.0.1/config/config.go000066400000000000000000000073301372346644400172730ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. // Package config contains the configuration file format for stenographer's main // configuration file.
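ClientVerifyingTLSConfig above builds the server side of the mutual-TLS handshake. A client such as stenocurl needs the mirror image: trust the same CA and present its own certificate. A sketch with illustrative file names and a hypothetical clientTLSConfig helper (not part of the certs package):

```go
// Client-side counterpart to ClientVerifyingTLSConfig: trust the CA,
// present a client cert/key pair. File names are placeholders.
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io/ioutil"
)

func clientTLSConfig(caFile, certFile, keyFile string) (*tls.Config, error) {
	caBytes, err := ioutil.ReadFile(caFile)
	if err != nil {
		return nil, fmt.Errorf("could not read CA file: %v", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caBytes) {
		return nil, fmt.Errorf("no CA certs found in %q", caFile)
	}
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, fmt.Errorf("could not load client key pair: %v", err)
	}
	return &tls.Config{
		RootCAs:      pool,                    // verify the server's cert
		Certificates: []tls.Certificate{cert}, // satisfy RequireAndVerifyClientCert
	}, nil
}

func main() {
	_, err := clientTLSConfig("ca_cert.pem", "client_cert.pem", "client_key.pem")
	fmt.Println(err) // non-nil when the PEM files are absent
}
```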
package config import ( "bytes" "encoding/json" "fmt" "io/ioutil" "net" "github.com/google/stenographer/base" ) var v = base.V // verbose logging const ( defaultDiskSpacePercentage = 10 // By default, ext3 has issues with >32k files, so we go for something less // than that. defaultMaxDirectoryFiles = 30000 defaultMaxOpenFiles = 100000 ) // ThreadConfig is a json-decoded configuration for each stenotype thread, // detailing where it should store data and how much disk space it should keep // available on each disk. type ThreadConfig struct { PacketsDirectory string IndexDirectory string DiskFreePercentage int `json:",omitempty"` MaxDirectoryFiles int `json:",omitempty"` } // RpcConfig is a json-decoded configuration for running the gRPC server. type RpcConfig struct { CaCert string ServerKey string ServerCert string ServerPort int ServerPcapPath string ServerPcapMaxSize int64 ClientPcapChunkSize int64 ClientPcapMaxSize int64 } // Config is a json-decoded configuration for running stenographer. type Config struct { Rpc *RpcConfig StenotypePath string Threads []ThreadConfig Interface string TestimonySocket string Flags []string Port int Host string // Location to listen. CertPath string // Directory where client and server certs are stored. MaxOpenFiles int // Max number of file descriptors opened at once } // ReadConfigFile reads in the given JSON encoded configuration file and returns // the Config object associated with the decoded configuration data. 
func ReadConfigFile(filename string) (*Config, error) { v(0, "Reading config %q", filename) data, err := ioutil.ReadFile(filename) if err != nil { return nil, fmt.Errorf("could not read config file %q: %v", filename, err) } dec := json.NewDecoder(bytes.NewReader(data)) var out Config if err := dec.Decode(&out); err != nil { return nil, fmt.Errorf("could not decode config file %q: %v", filename, err) } if out.MaxOpenFiles <= 0 { out.MaxOpenFiles = defaultMaxOpenFiles } for i, thread := range out.Threads { if thread.DiskFreePercentage <= 0 { out.Threads[i].DiskFreePercentage = defaultDiskSpacePercentage } if thread.MaxDirectoryFiles <= 0 { out.Threads[i].MaxDirectoryFiles = defaultMaxDirectoryFiles } } return &out, nil } // Validate checks the configuration for common errors. func (c Config) Validate() error { for n, thread := range c.Threads { if thread.PacketsDirectory == "" { return fmt.Errorf("No packet directory specified for thread %d in configuration", n) } if thread.IndexDirectory == "" { return fmt.Errorf("No index directory specified for thread %d in configuration", n) } } if len(c.TestimonySocket) > 0 && len(c.Interface) > 0 { return fmt.Errorf("Can't use both \"Interface\" and \"TestimonySocket\" options") } if host := net.ParseIP(c.Host); host == nil { return fmt.Errorf("invalid listening location %q in configuration", c.Host) } return nil } stenographer-1.0.1/configs/000077500000000000000000000000001372346644400156575ustar00rootroot00000000000000stenographer-1.0.1/configs/limits.conf000066400000000000000000000005611372346644400200310ustar00rootroot00000000000000# stenographer limits # # Stenographer generally: # * uses a lot of files # * uses large files # So let's make sure it can. # # Note: You don't need to use this if you use upstart.conf to start # stenographer with upstart, since it sets its own limits. 
# Allow files up to 4G stenographer - fsize 4194304 # Allow unlimited open files stenographer - nofile 1000000 stenographer-1.0.1/configs/steno.conf000066400000000000000000000005731372346644400176630ustar00rootroot00000000000000{ "Threads": [ { "PacketsDirectory": "/path/to/thread0/packets/directory" , "IndexDirectory": "/path/to/thread0/index/directory" , "MaxDirectoryFiles": 30000 , "DiskFreePercentage": 10 } ] , "StenotypePath": "/usr/bin/stenotype" , "Interface": "em1" , "Port": 1234 , "Host": "127.0.0.1" , "Flags": [] , "CertPath": "/etc/stenographer/certs" } stenographer-1.0.1/configs/systemd.conf000066400000000000000000000020701372346644400202150ustar00rootroot00000000000000# Copyright 2014 Google Inc. All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # # stenographer - full packet to disk capture # # stenographer is a simple, fast method of writing live packets to disk, # then requesting those packets after-the-fact for post-hoc analysis. [Unit] Description=packet capture to disk After=network.target [Service] User=stenographer Group=stenographer SyslogIdentifier=stenographer LimitFSIZE=4294967296 LimitNOFILE=1000000 ExecStart=/usr/bin/stenographer ExecStopPost=/bin/pkill -9 stenotype [Install] WantedBy=multi-user.target stenographer-1.0.1/configs/upstart.conf000066400000000000000000000021751372346644400202350ustar00rootroot00000000000000# Copyright 2014 Google Inc. All rights reserved. 
# # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # # stenographer - full packet to disk capture # # stenographer is a simple, fast method of writing live packets to disk, # then requesting those packets after-the-fact for post-hoc analysis. description "packet capturing to disk" start on runlevel [2345] stop on runlevel [!2345] respawn respawn limit 1 300 # At most once per 5 minutes setuid stenographer setgid nogroup limit nofile 1000000 1000000 limit fsize 4294967296 4294967296 exec /usr/bin/stenographer post-stop script /bin/sleep 15 /usr/bin/killall -9 stenotype || true end script stenographer-1.0.1/env/000077500000000000000000000000001372346644400150175ustar00rootroot00000000000000stenographer-1.0.1/env/env.go000066400000000000000000000263261372346644400161470ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. // Package env contains the environment that stenographer will set up and run. 
// This is the main part of the stenographer server, setting up stenotype's // environment and running it, and serving all HTTP requests. package env import ( "encoding/json" "fmt" "io" "io/ioutil" "log" "net/http" "os" "os/exec" "path/filepath" "strings" "time" "github.com/google/stenographer/base" "github.com/google/stenographer/certs" "github.com/google/stenographer/config" "github.com/google/stenographer/filecache" "github.com/google/stenographer/httputil" "github.com/google/stenographer/query" "github.com/google/stenographer/stats" "github.com/google/stenographer/thread" "golang.org/x/net/context" ) var ( v = base.V // verbose logging rmHiddenFiles = stats.S.Get("removed_hidden_files") rmMismatchFiles = stats.S.Get("removed_mismatched_files") ) const ( fileSyncFrequency = 15 * time.Second // These files will be read from Config.CertPath. // Use stenokeys.sh to generate them. caCertFilename = "ca_cert.pem" serverCertFilename = "server_cert.pem" serverKeyFilename = "server_key.pem" ) // Serve starts up an HTTP server using http.DefaultServeMux to handle // requests. This server will serve over TLS, using the certs // stored in c.CertPath to verify itself to clients and verify clients.
func (e *Env) Serve() error { tlsConfig, err := certs.ClientVerifyingTLSConfig(filepath.Join(e.conf.CertPath, caCertFilename)) if err != nil { return fmt.Errorf("cannot verify client cert: %v", err) } server := &http.Server{ Addr: fmt.Sprintf("%s:%d", e.conf.Host, e.conf.Port), TLSConfig: tlsConfig, } http.HandleFunc("/query", e.handleQuery) http.Handle("/debug/stats", stats.S) return server.ListenAndServeTLS( filepath.Join(e.conf.CertPath, serverCertFilename), filepath.Join(e.conf.CertPath, serverKeyFilename)) } func (e *Env) handleQuery(w http.ResponseWriter, r *http.Request) { w = httputil.Log(w, r, true) defer log.Print(w) limit, err := base.LimitFromHeaders(r.Header) if err != nil { http.Error(w, "Invalid Limit Headers", http.StatusBadRequest) return } queryBytes, err := ioutil.ReadAll(r.Body) if err != nil { http.Error(w, "could not read request body", http.StatusBadRequest) return } q, err := query.NewQuery(string(queryBytes)) if err != nil { http.Error(w, "could not parse query", http.StatusBadRequest) return } ctx := httputil.Context(w, r, time.Minute*15) defer ctx.Cancel() packets := e.Lookup(ctx, q) w.Header().Set("Content-Type", "application/octet-stream") base.PacketsToFile(packets, w, limit) } // New returns a new Env for use in running Stenotype. func New(c config.Config) (_ *Env, returnedErr error) { if err := c.Validate(); err != nil { return nil, err } dirname, err := ioutil.TempDir("", "stenographer") if err != nil { return nil, fmt.Errorf("couldn't create temp directory: %v", err) } defer func() { // If this fails, remove the temp dir. if returnedErr != nil { os.RemoveAll(dirname) } }() threads, err := thread.Threads(c.Threads, dirname, filecache.NewCache(c.MaxOpenFiles)) if err != nil { return nil, err } d := &Env{ conf: c, name: dirname, threads: threads, done: make(chan bool), } go d.callEvery(d.syncFiles, fileSyncFrequency) return d, nil } // args is the set of command line arguments to pass to stenotype.
func (d *Env) args() []string { res := append(d.conf.Flags, fmt.Sprintf("--threads=%d", len(d.conf.Threads)), fmt.Sprintf("--dir=%s", d.Path())) if len(d.conf.Interface) > 0 { res = append(res, fmt.Sprintf("--iface=%s", d.conf.Interface)) } if len(d.conf.TestimonySocket) > 0 { res = append(res, fmt.Sprintf("--testimony=%s", d.conf.TestimonySocket)) } return res } // stenotype returns an exec.Cmd which runs the stenotype binary with all of // the appropriate flags. func (d *Env) stenotype() *exec.Cmd { v(0, "Starting stenotype") args := d.args() v(1, "Starting as %q with args %q", d.conf.StenotypePath, args) return exec.Command(d.conf.StenotypePath, args...) } // Env contains information necessary to run Stenotype. type Env struct { conf config.Config name string threads []*thread.Thread done chan bool fc *filecache.Cache // StenotypeOutput is the writer that stenotype STDOUT/STDERR will be // redirected to. StenotypeOutput io.Writer } // Close closes the directory. This should only be done when stenotype has // stopped using it. After this call, Env should no longer be used. func (d *Env) Close() error { return os.RemoveAll(d.name) } func (d *Env) callEvery(cb func(), freq time.Duration) { ticker := time.NewTicker(freq) defer ticker.Stop() cb() // Call function immediately the first time around.
for { select { case <-d.done: return case <-ticker.C: cb() } } } func removeHiddenFilesFrom(dir string) { files, err := ioutil.ReadDir(dir) if err != nil { log.Printf("Hidden file cleanup failed, could not read directory: %v", err) return } for _, file := range files { if file.Mode().IsRegular() && strings.HasPrefix(file.Name(), ".") { filename := filepath.Join(dir, file.Name()) if err := os.Remove(filename); err != nil { log.Printf("Unable to remove hidden file %q: %v", filename, err) } else { rmHiddenFiles.Increment() log.Printf("Deleted stale output file %q", filename) } } } } func filesIn(dir string) (map[string]os.FileInfo, error) { files, err := ioutil.ReadDir(dir) if err != nil { return nil, err } out := map[string]os.FileInfo{} for _, file := range files { if file.Mode().IsRegular() { out[file.Name()] = file } } return out, nil } // removeOldFiles removes hidden files from previous runs, as well as packet // files without indexes and vice versa. func (d *Env) removeOldFiles() { for _, thread := range d.conf.Threads { v(1, "Checking %q/%q for stale pkt/idx files...", thread.PacketsDirectory, thread.IndexDirectory) removeHiddenFilesFrom(thread.PacketsDirectory) removeHiddenFilesFrom(thread.IndexDirectory) packetFiles, err := filesIn(thread.PacketsDirectory) if err != nil { log.Printf("could not get files from %q: %v", thread.PacketsDirectory, err) continue } indexFiles, err := filesIn(thread.IndexDirectory) if err != nil { log.Printf("could not get files from %q: %v", thread.IndexDirectory, err) continue } var mismatchedFilesToRemove []string for file := range packetFiles { if indexFiles[file] == nil { mismatchedFilesToRemove = append(mismatchedFilesToRemove, filepath.Join(thread.PacketsDirectory, file)) log.Printf("Removing packet file %q without index found in %q", file, thread.PacketsDirectory) } } for file := range indexFiles { if packetFiles[file] == nil { mismatchedFilesToRemove = append(mismatchedFilesToRemove, filepath.Join(thread.IndexDirectory, 
file)) log.Printf("Removing index file %q without packets found in %q", file, thread.IndexDirectory) } } for _, file := range mismatchedFilesToRemove { v(2, "Removing file %q", file) if err := os.Remove(file); err != nil { log.Printf("Unable to remove mismatched file %q", file) } else { rmMismatchFiles.Increment() } } } } func (d *Env) syncFiles() { for _, t := range d.threads { t.SyncFiles() } } // Path returns the underlying directory path for the given Env. func (d *Env) Path() string { return d.name } // Lookup looks up the given query in all blockfiles currently known in this // Env. func (d *Env) Lookup(ctx context.Context, q query.Query) *base.PacketChan { var inputs []*base.PacketChan for _, thread := range d.threads { inputs = append(inputs, thread.Lookup(ctx, q)) } return base.MergePacketChans(ctx, inputs) } // ExportDebugHandlers exports a few debugging handlers to an HTTP ServeMux. func (d *Env) ExportDebugHandlers(mux *http.ServeMux) { mux.HandleFunc("/debug/config", func(w http.ResponseWriter, r *http.Request) { w = httputil.Log(w, r, false) defer log.Print(w) w.Header().Set("Content-Type", "application/json") json.NewEncoder(w).Encode(d.conf) }) for _, thread := range d.threads { thread.ExportDebugHandlers(mux) } oldestTimestamp := stats.S.Get("oldest_timestamp") go func() { for c := time.Tick(time.Second * 10); ; <-c { t := time.Unix(0, 0) for _, thread := range d.threads { ts := thread.OldestFileTimestamp() if ts.After(t) { t = ts } } oldestTimestamp.Set(t.UnixNano()) } }() } // MinLastFileSeen returns the timestamp of the oldest among the newest files // created by all threads. func (d *Env) MinLastFileSeen() time.Time { var t time.Time for _, thread := range d.threads { ls := thread.FileLastSeen() if t.IsZero() || ls.Before(t) { t = ls } } return t } // runStaleFileCheck watches files generated by stenotype to make sure it's // regularly generating new files. 
It will Kill() stenotype if it doesn't see // at least one new file every maxFileLastSeenDuration in each thread directory. func (d *Env) runStaleFileCheck(cmd *exec.Cmd, done chan struct{}) { ticker := time.NewTicker(maxFileLastSeenDuration) defer ticker.Stop() for { select { case <-ticker.C: v(2, "Checking stenotype for stale files...") diff := time.Now().Sub(d.MinLastFileSeen()) if diff > maxFileLastSeenDuration { log.Printf("Restarting stenotype due to stale file. Age: %v", diff) if err := cmd.Process.Kill(); err != nil { log.Fatalf("Failed to kill stenotype, stale file found: %v", err) } } else { v(2, "Stenotype up to date, last file update %v ago", diff) } case <-done: return } } } const ( minStenotypeRuntimeForRestart = time.Minute maxFileLastSeenDuration = time.Minute * 5 ) // runStenotypeOnce runs the stenotype binary a single time, returning any // errors associated with its running. func (d *Env) runStenotypeOnce() error { d.removeOldFiles() cmd := d.stenotype() done := make(chan struct{}) defer close(done) // Start running stenotype. cmd.Stdout = d.StenotypeOutput cmd.Stderr = d.StenotypeOutput if err := cmd.Start(); err != nil { return fmt.Errorf("cannot start stenotype: %v", err) } go d.runStaleFileCheck(cmd, done) if err := cmd.Wait(); err != nil { return fmt.Errorf("stenotype wait failed: %v", err) } return fmt.Errorf("stenotype stopped") } // RunStenotype keeps the stenotype binary running, restarting it if necessary // but trying not to allow crash loops. 
func (d *Env) RunStenotype() { for { start := time.Now() v(1, "Running Stenotype") err := d.runStenotypeOnce() duration := time.Since(start) log.Printf("Stenotype stopped after %v: %v", duration, err) if duration < minStenotypeRuntimeForRestart { log.Fatalf("Stenotype ran for too little time, crashing to avoid stenotype crash loop") } } } stenographer-1.0.1/filecache/000077500000000000000000000000001372346644400161325ustar00rootroot00000000000000stenographer-1.0.1/filecache/filecache.go000066400000000000000000000077051372346644400203750ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. // Package filecache provides a LRU cache of open os.File objects, closing old // files as new ones are opened. 
package filecache import ( "fmt" "os" "sync" "time" "github.com/google/stenographer/base" ) var v = base.V type CachedFile struct { cache *Cache // protected by cache.mu at time.Time prev, next *CachedFile mu sync.RWMutex // protected by mu filename string f *os.File } func NewCache(maxOpened int) *Cache { if maxOpened < 1 { panic("maxOpened must be > 0") } return &Cache{maxOpened: maxOpened} } type Cache struct { mu sync.Mutex first, last *CachedFile opened, maxOpened int } func (cf *CachedFile) moveToFront() { cf.at = time.Now() if cf.cache.first == cf { return } // Remove from current place in list if cf.prev != nil { cf.prev.next = cf.next } if cf.next != nil { cf.next.prev = cf.prev } // Update last element in list, if necessary if cf.cache.last == nil { cf.cache.last = cf } else if cf.cache.last == cf { cf.cache.last = cf.prev } // Update first element in list if cf.cache.first != nil { cf.next = cf.cache.first cf.cache.first.prev = cf } cf.cache.first = cf cf.prev = nil } func (c *Cache) Open(filename string) *CachedFile { v(3, "Deferring open of %q", filename) return &CachedFile{cache: c, filename: filename} } func (cf *CachedFile) readLockedFile() error { cf.cache.mu.Lock() cf.moveToFront() cf.cache.mu.Unlock() for { cf.mu.RLock() if cf.f != nil { return nil } cf.mu.RUnlock() if err := cf.openFile(); err != nil { return fmt.Errorf("lazily opening: %v", err) } } } func (cf *CachedFile) ReadAt(p []byte, off int64) (int, error) { if err := cf.readLockedFile(); err != nil { return 0, err } defer cf.mu.RUnlock() return cf.f.ReadAt(p, off) } func (cf *CachedFile) Read(p []byte) (int, error) { if err := cf.readLockedFile(); err != nil { return 0, err } defer cf.mu.RUnlock() return cf.f.Read(p) } func (cf *CachedFile) Stat() (os.FileInfo, error) { if err := cf.readLockedFile(); err != nil { return nil, err } defer cf.mu.RUnlock() return cf.f.Stat() } func (cf *CachedFile) Write(p []byte) (int, error) { return 0, fmt.Errorf("cached file not writable") } func (cf 
*CachedFile) Sync() error { return fmt.Errorf("cached file not syncable") } func (cf *CachedFile) openFile() error { cf.cache.mu.Lock() defer cf.cache.mu.Unlock() cf.mu.Lock() defer cf.mu.Unlock() if cf.f != nil { return nil } v(2, "Opening %q", cf.filename) newF, err := os.Open(cf.filename) if err != nil { v(1, "Open of %q failed: %v", cf.filename, err) return err } cf.f = newF cf.moveToFront() cf.cache.opened++ for cf.cache.opened > cf.cache.maxOpened { v(3, "Cached files above max, closing last") oldLast := cf.cache.last cf.cache.last = oldLast.prev if cf.cache.last != nil { cf.cache.last.next = nil } oldLast.prev = nil oldLast.next = nil oldLast.closeFile() } return nil } func (cf *CachedFile) Close() error { // Remove me from the cache, so others can't reference me cf.cache.mu.Lock() defer cf.cache.mu.Unlock() cf.mu.Lock() defer cf.mu.Unlock() return cf.closeFile() } func (cf *CachedFile) closeFile() error { if cf.f == nil { v(3, "Close of already-closed file %q ignored", cf.filename) return nil } v(2, "Closing %q", cf.filename) cf.cache.opened-- f := cf.f cf.f = nil return f.Close() } stenographer-1.0.1/filecache/filecache_test.go000066400000000000000000000025631372346644400214310ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. 
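The Cache above bounds open file descriptors by hand-rolling an intrusive most-recently-used list over CachedFile. The eviction policy itself reduces to a small LRU; here is a toy version using container/list, not the package's actual API:

```go
// Toy LRU mirroring filecache's eviction policy: most recently used
// entries at the front, evict from the back once over capacity.
package main

import (
	"container/list"
	"fmt"
)

type lru struct {
	max   int
	order *list.List // front = most recently used
	items map[string]*list.Element
}

func newLRU(max int) *lru {
	return &lru{max: max, order: list.New(), items: map[string]*list.Element{}}
}

// touch marks name as just-used, returning the name of any evicted entry.
func (l *lru) touch(name string) (evicted string) {
	if e, ok := l.items[name]; ok {
		l.order.MoveToFront(e)
		return ""
	}
	l.items[name] = l.order.PushFront(name)
	if l.order.Len() > l.max {
		back := l.order.Back()
		l.order.Remove(back)
		evicted = back.Value.(string)
		delete(l.items, evicted)
	}
	return evicted
}

func main() {
	l := newLRU(2)
	l.touch("a")
	l.touch("b")
	l.touch("a")              // "a" becomes most recent again
	fmt.Println(l.touch("c")) // prints "b": least recently used falls out
}
```

filecache adds what this toy omits: per-file RWMutexes so a ReadAt can pin a file open while the cache lock is released, and lazy reopening when an evicted file is touched again.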
package filecache import ( "fmt" "io/ioutil" "os" "path/filepath" "testing" ) func TestCache(t *testing.T) { d, err := ioutil.TempDir("", "filecache_test") if err != nil { t.Fatal(err) } defer os.RemoveAll(d) paths := make([]string, 100) for i := 0; i < 100; i++ { intstr := fmt.Sprintf("%d", i) paths[i] = filepath.Join(d, intstr) if err := ioutil.WriteFile(paths[i], []byte(intstr), 0600); err != nil { t.Fatal(err) } } c := NewCache(10) var b [1]byte for i := 0; i < 100; i++ { if _, err := c.Open(paths[i]).ReadAt(b[:], 0); err != nil { t.Fatalf("opening/reading %q: %v", paths[i], err) } } for i := 0; i < 100; i++ { if _, err := c.Open(paths[i]).ReadAt(b[:], 0); err != nil { t.Fatalf("opening/reading %q: %v", paths[i], err) } } } stenographer-1.0.1/format.sh000077500000000000000000000012471372346644400160620ustar00rootroot00000000000000#!/bin/bash # Copyright 2014 Google Inc. All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. cd $(dirname $0) clang-format-3.5 -style=Google -i stenotype/*.{cc,h} stenographer-1.0.1/httputil/000077500000000000000000000000001372346644400161045ustar00rootroot00000000000000stenographer-1.0.1/httputil/httputil.go000066400000000000000000000063421372346644400203150ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. 
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// Package httputil provides http utilities for stenographer.
package httputil

import (
	"bytes"
	"fmt"
	"io"
	"io/ioutil"
	"log"
	"net/http"
	"strings"
	"time"

	"github.com/google/stenographer/base"
	"github.com/google/stenographer/stats"
)

// Context returns a new base.Context that cancels when the
// underlying http.ResponseWriter closes.
func Context(w http.ResponseWriter, r *http.Request, timeout time.Duration) base.Context {
	ctx := base.NewContext(timeout)
	if closer, ok := w.(http.CloseNotifier); ok {
		go func() {
			select {
			case <-closer.CloseNotify():
				log.Printf("Detected closed HTTP connection, canceling query")
				ctx.Cancel()
			case <-ctx.Done():
			}
		}()
	}
	return ctx
}

type httpLog struct {
	r      *http.Request
	w      http.ResponseWriter
	nBytes int
	code   int
	err    error
	start  time.Time
	body   string
}

// Log returns a new ResponseWriter which provides a nice
// String() method for easy printing. The expected usage is:
//   func (h *myHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
//     w = httputil.Log(w, r, false)
//     defer log.Print(w)  // Prints out useful information about request AND response
//     ... do stuff ...
//   }
func Log(w http.ResponseWriter, r *http.Request, logRequestBody bool) http.ResponseWriter {
	h := &httpLog{w: w, r: r, start: time.Now(), code: http.StatusOK}
	if logRequestBody {
		var buf bytes.Buffer
		_, h.err = io.Copy(&buf, r.Body)
		r.Body.Close()
		r.Body = ioutil.NopCloser(&buf)
		h.body = fmt.Sprintf(" RequestBody:%q", buf.String())
	}
	return h
}

// Header implements http.ResponseWriter.
func (h *httpLog) Header() http.Header { return h.w.Header() } // Write implements http.ResponseWriter and io.Writer. func (h *httpLog) Write(data []byte) (int, error) { n, err := h.w.Write(data) h.nBytes += n if err != nil && h.err == nil { h.err = err } return n, err } // WriteHeader implements http.ResponseWriter. func (h *httpLog) WriteHeader(code int) { h.code = code h.w.WriteHeader(code) } // String implements fmt.Stringer. func (h *httpLog) String() string { var errstr string if h.err != nil { errstr = h.err.Error() } duration := time.Since(h.start) prefix := "http_request" + strings.Replace(h.r.URL.Path, "/", "_", -1) + "_" + h.r.Method + "_" stats.S.Get(prefix + "completed").Increment() stats.S.Get(prefix + "nanos").IncrementBy(duration.Nanoseconds()) stats.S.Get(prefix + "bytes").IncrementBy(int64(h.nBytes)) return fmt.Sprintf("Requester:%q Request:\"%v %v %v\" Time:%v Bytes:%v Code:%q Err:%q%v", h.r.RemoteAddr, h.r.Method, h.r.URL, h.r.Proto, duration, h.nBytes, http.StatusText(h.code), errstr, h.body) } stenographer-1.0.1/indexfile/000077500000000000000000000000001372346644400161765ustar00rootroot00000000000000stenographer-1.0.1/indexfile/indexfile.go000066400000000000000000000160351372346644400205010ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. // Package indexfile provides methods for querying stenotype indexes to find the // blockfile positions of packets. 
package indexfile import ( "bytes" "encoding/binary" "encoding/hex" "fmt" "io" "net" "strings" "github.com/golang/leveldb/table" "github.com/google/stenographer/base" "github.com/google/stenographer/filecache" "github.com/google/stenographer/stats" "golang.org/x/net/context" ) var ( v = base.V // verbose logging locally. indexReadNanos = stats.S.Get("indexfile_read_nanos") indexReads = stats.S.Get("indexfile_reads") indexCurrentReads = stats.S.Get("indexfile_current_reads") ) // Major version number of the file format that we support. const majorVersionNumber = 2 // IndexFile wraps a stenotype index, allowing it to be queried. type IndexFile struct { name string ss *table.Reader } // IndexPathFromBlockfilePath returns the path to an index file based on the path to a // block file. func IndexPathFromBlockfilePath(p string) string { return strings.Replace(p, "PKT", "IDX", 1) } // BlockfilePathFromIndexPath returns the path to a block file based on the path to an // index file. func BlockfilePathFromIndexPath(p string) string { return strings.Replace(p, "IDX", "PKT", 1) } // NewIndexFile returns a new handle to the named index file. 
func NewIndexFile(filename string, fc *filecache.Cache) (*IndexFile, error) {
	v(1, "opening index %q", filename)
	ss := table.NewReader(fc.Open(filename), nil)
	if versions, err := ss.Get([]byte{0}, nil); err != nil {
		return nil, fmt.Errorf("invalid index file %q missing versions record: %v", filename, err)
	} else if len(versions) != 8 {
		return nil, fmt.Errorf("invalid index file %q invalid versions record: %v", filename, versions)
	} else if major, minor := binary.BigEndian.Uint32(versions[:4]), binary.BigEndian.Uint32(versions[4:]); major != majorVersionNumber {
		return nil, fmt.Errorf("invalid index file %q: version mismatch, want %d got %d", filename, majorVersionNumber, major)
	} else {
		v(3, "index file %q has file format version %d:%d", filename, major, minor)
	}
	if *base.VerboseLogging >= 10 {
		iter := ss.Find([]byte{}, nil)
		v(4, "=== %q ===", filename)
		for iter.Next() {
			v(4, "  %v", iter.Key())
		}
		v(4, "  ERR: %v", iter.Close())
	}
	index := &IndexFile{ss: ss, name: filename}
	return index, nil
}

// Name returns the name of the file underlying this index.
func (i *IndexFile) Name() string {
	return i.name
}

// IPPositions returns the positions in the block file of all packets with IPs
// between the given ranges. Both IPs must be 4 or 16 bytes long, both must be
// the same length, and from must be <= to.
func (i *IndexFile) IPPositions(ctx context.Context, from, to net.IP) (base.Positions, error) {
	var version byte
	switch {
	case len(from) != len(to):
		return nil, fmt.Errorf("IP length mismatch")
	case bytes.Compare(from, to) > 0:
		return nil, fmt.Errorf("from IP greater than to IP")
	case len(from) == 16:
		version = 6
	case len(from) == 4:
		version = 4
	default:
		return nil, fmt.Errorf("Invalid IP length")
	}
	return i.positions(
		ctx,
		append([]byte{version}, []byte(from)...),
		append([]byte{version}, []byte(to)...))
}

// ProtoPositions returns the positions in the block file of all packets with
// the given IP protocol number.
func (i *IndexFile) ProtoPositions(ctx context.Context, proto byte) (base.Positions, error) {
	return i.positionsSingleKey(ctx, []byte{1, proto})
}

// PortPositions returns the positions in the block file of all packets with
// the given port number (TCP or UDP).
func (i *IndexFile) PortPositions(ctx context.Context, port uint16) (base.Positions, error) {
	var buf [3]byte
	binary.BigEndian.PutUint16(buf[1:], port)
	buf[0] = 2
	return i.positionsSingleKey(ctx, buf[:])
}

// VLANPositions returns the positions in the block file of all packets with
// the given VLAN number.
func (i *IndexFile) VLANPositions(ctx context.Context, port uint16) (base.Positions, error) {
	var buf [3]byte
	binary.BigEndian.PutUint16(buf[1:], port)
	buf[0] = 3
	return i.positionsSingleKey(ctx, buf[:])
}

// MPLSPositions returns the positions in the block file of all packets with
// the given MPLS number.
func (i *IndexFile) MPLSPositions(ctx context.Context, mpls uint32) (base.Positions, error) {
	var buf [5]byte
	binary.BigEndian.PutUint32(buf[1:], mpls)
	buf[0] = 5
	return i.positionsSingleKey(ctx, buf[:])
}

// Dump writes out a debug version of the entire index to the given writer.
func (i *IndexFile) Dump(out io.Writer, start, finish []byte) {
	for iter := i.ss.Find(start, nil); iter.Next() && bytes.Compare(iter.Key(), finish) <= 0; {
		fmt.Fprintf(out, "%v\n", hex.EncodeToString(iter.Key()))
	}
}

// positions returns a set of positions to look for packets, based on a
// lookup of all blockfile positions stored between (inclusively) index
// keys 'from' and 'to'.
func (i *IndexFile) positions(ctx context.Context, from, to []byte) (out base.Positions, _ error) { v(4, "%q multi key iterator %v:%v start", i.name, from, to) if len(from) != len(to) { return nil, fmt.Errorf("invalid from/to lengths don't match: %v %v", from, to) } indexCurrentReads.Increment() defer func() { indexCurrentReads.IncrementBy(-1) indexReads.Increment() }() defer indexReadNanos.NanoTimer()() iter := i.ss.Find(from, nil) for iter.Next() && !base.ContextDone(ctx) { if to != nil && bytes.Compare(iter.Key(), to) > 0 { v(4, "%q multi key iterator %v:%v hit limit with %v", i.name, from, to, iter.Key()) break } current := make(base.Positions, len(iter.Value())/4) for i := 0; i < len(iter.Value()); i += 4 { current[i/4] = int64(binary.BigEndian.Uint32(iter.Value()[i : i+4])) } v(4, "%q multi key iterator got in-iter union of length %d for %v", i.name, len(current), iter.Key()) if out == nil { out = current } else { out = out.Union(current) } } if err := ctx.Err(); err != nil { v(4, "%q multi key iterator context err: %v", i.name, err) return nil, err } v(4, "%q multi key iterator done, got %d", i.name, len(out)) if err := iter.Close(); err != nil { v(4, "%q multi key iterator err=%v", i.name, err) return nil, err } return out, nil } func (i *IndexFile) positionsSingleKey(ctx context.Context, key []byte) (base.Positions, error) { return i.positions(ctx, key, key) } func parseIP(in string) net.IP { ip := net.ParseIP(in) if ip == nil { return nil } if ip4 := ip.To4(); ip4 != nil { ip = ip4 } return ip } // Close the indexfile. func (i *IndexFile) Close() error { return i.ss.Close() } stenographer-1.0.1/indexfile/indexfile_test.go000066400000000000000000000074221372346644400215400ustar00rootroot00000000000000// Copyright 2015 Google Inc. All rights reserved. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. 
// You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. package indexfile import ( "bytes" "encoding/hex" "reflect" "testing" "golang.org/x/net/context" "github.com/google/stenographer/base" "github.com/google/stenographer/filecache" ) var ctx = context.Background() func testIndexFile(t *testing.T, filename string) *IndexFile { idx, err := NewIndexFile(filename, filecache.NewCache(10)) if err != nil { t.Fatal(err) } return idx } func TestIPPositions(t *testing.T) { idx := testIndexFile(t, "../testdata/IDX0/dhcp") defer idx.Close() for _, test := range []struct { start string end string want base.Positions }{ {"192.168.0.1", "192.168.0.254", base.Positions{1049024, 1049848}}, {"10.0.0.1", "10.0.0.254", nil}, } { if got, err := idx.IPPositions(ctx, parseIP(test.start), parseIP(test.end)); err != nil { t.Fatal(err) } else if !reflect.DeepEqual(got, test.want) { t.Errorf("wrong IP positions.\nwant: %v\n got: %v\n", test.want, got) } } } func TestMPLSPositions(t *testing.T) { idx := testIndexFile(t, "../testdata/IDX0/mpls") defer idx.Close() for _, test := range []struct { label uint32 want base.Positions }{ {29, base.Positions{1051144, 1054304, 1054592, 1054736, 1054888, 1055184, 1055328, 1055472, 1055800, 1056824, 1056968}}, {55, nil}, } { if got, err := idx.MPLSPositions(ctx, test.label); err != nil { t.Fatal(err) } else if !reflect.DeepEqual(got, test.want) { t.Errorf("wrong MPLS positions.\nwant: %v\n got: %v\n", test.want, got) } } } func TestVLANPositions(t *testing.T) { idx := testIndexFile(t, "../testdata/IDX0/vlan") defer idx.Close() for _, test := range []struct { id uint16 want base.Positions }{ {7, 
base.Positions{1123648, 1126248, 1178544, 1192552, 1208680}},
		{8, nil},
	} {
		if got, err := idx.VLANPositions(ctx, test.id); err != nil {
			t.Fatal(err)
		} else if !reflect.DeepEqual(got, test.want) {
			t.Errorf("wrong VLAN positions.\nwant: %v\n got: %v\n", test.want, got)
		}
	}
}

func TestProtoPositions(t *testing.T) {
	idx := testIndexFile(t, "../testdata/IDX0/dhcp")
	defer idx.Close()
	for _, test := range []struct {
		proto byte
		want  base.Positions
	}{
		{'\x11', base.Positions{1048624, 1049024, 1049448, 1049848}},
		{'\x12', nil},
	} {
		if got, err := idx.ProtoPositions(ctx, test.proto); err != nil {
			t.Fatal(err)
		} else if !reflect.DeepEqual(got, test.want) {
			t.Errorf("wrong proto positions.\nwant: %v\n got: %v\n", test.want, got)
		}
	}
}

func TestPortPositions(t *testing.T) {
	idx := testIndexFile(t, "../testdata/IDX0/dhcp")
	defer idx.Close()
	for _, test := range []struct {
		port uint16
		want base.Positions
	}{
		{67, base.Positions{1048624, 1049024, 1049448, 1049848}},
		{69, nil},
	} {
		if got, err := idx.PortPositions(ctx, test.port); err != nil {
			t.Fatal(err)
		} else if !reflect.DeepEqual(got, test.want) {
			t.Errorf("wrong port positions.\nwant: %v\n got: %v\n", test.want, got)
		}
	}
}

func TestDump(t *testing.T) {
	idx := testIndexFile(t, "../testdata/IDX0/dhcp")
	want := "00\n0111\n013a\n"
	var w bytes.Buffer
	start, _ := hex.DecodeString("00")
	end, _ := hex.DecodeString("02")
	idx.Dump(&w, start, end)
	got := w.String()
	if got != want {
		t.Fatalf("invalid dump.\nwant %q\n got: %q\n", want, got)
	}
}

stenographer-1.0.1/install.sh

#!/bin/bash
# Copyright 2014 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # This is not meant to be a permanent addition to stenographer, more of a # hold-over until we can get actual debian packaging worked out. Also, this # will probably guide the debian work by detailing all the actual stuff that # needs to be done to install stenographer correctly. BINDIR="${BINDIR-/usr/bin}" cd "$(dirname $0)" source lib.sh set -e Info "Making sure we have sudo access" sudo cat /dev/null InstallPackage libaio-dev InstallPackage libleveldb-dev InstallPackage libsnappy-dev InstallPackage g++ InstallPackage libcap2-bin InstallPackage libseccomp-dev InstallPackage jq InstallPackage openssl Info "Building stenographer" go build Info "Building stenotype" pushd stenotype make popd set +e Info "Killing aleady-running processes" sudo service stenographer stop ReallyKill stenographer ReallyKill stenotype set -e if ! id stenographer >/dev/null 2>&1; then Info "Setting up stenographer user" sudo adduser --system --no-create-home stenographer fi if ! getent group stenographer >/dev/null 2>&1; then Info "Setting up stenographer group" sudo addgroup --system stenographer fi if [ ! -f /etc/security/limits.d/stenographer.conf ]; then Info "Setting up stenographer limits" sudo cp -v configs/limits.conf /etc/security/limits.d/stenographer.conf fi if [ -d /etc/init/ ]; then if [ ! -f /etc/init/stenographer.conf ]; then Info "Setting up stenographer upstart config" sudo cp -v configs/upstart.conf /etc/init/stenographer.conf sudo chmod 0644 /etc/init/stenographer.conf fi fi if [ -d /lib/systemd/system/ ]; then if [ ! 
-f /lib/systemd/system/stenographer.service ]; then Info "Setting up stenographer systemd config" sudo cp -v configs/systemd.conf /lib/systemd/system/stenographer.service sudo chmod 644 /lib/systemd/system/stenographer.service fi fi if [ ! -d /etc/stenographer/certs ]; then Info "Setting up stenographer /etc directory" sudo mkdir -p /etc/stenographer/certs sudo chown -R root:root /etc/stenographer/certs if [ ! -f /etc/stenographer/config ]; then sudo cp -vf configs/steno.conf /etc/stenographer/config sudo chown root:root /etc/stenographer/config sudo chmod 644 /etc/stenographer/config fi sudo chown root:root /etc/stenographer fi if grep -q /path/to /etc/stenographer/config; then Error "Create directories to output packets/indexes to, then update" Error "/etc/stenographer/config to point to them." Error "Directories should be owned by stenographer:stenographer." exit 1 fi sudo ./stenokeys.sh stenographer stenographer Info "Copying stenographer/stenotype" sudo cp -vf stenographer "$BINDIR/stenographer" sudo chown stenographer:root "$BINDIR/stenographer" sudo chmod 0700 "$BINDIR/stenographer" sudo cp -vf stenotype/stenotype "$BINDIR/stenotype" sudo chown stenographer:root "$BINDIR/stenotype" sudo chmod 0500 "$BINDIR/stenotype" SetCapabilities "$BINDIR/stenotype" Info "Copying stenoread/stenocurl" sudo cp -vf stenoread "$BINDIR/stenoread" sudo chown root:root "$BINDIR/stenoread" sudo chmod 0755 "$BINDIR/stenoread" sudo cp -vf stenocurl "$BINDIR/stenocurl" sudo chown root:root "$BINDIR/stenocurl" sudo chmod 0755 "$BINDIR/stenocurl" Info "Starting stenographer using upstart" # If you're not using upstart, you can replace this with: # sudo -b -u stenographer $BINDIR/stenographer & sudo service stenographer start Info "Checking for running processes..." sleep 5 if Running stenographer; then Info " * Stenographer up and running" else Error " !!! Stenographer not running !!!" 
tail -n 100 /var/log/messages | grep steno exit 1 fi if Running stenotype; then Info " * Stenotype up and running" else Error " !!! Stenotype not running !!!" tail -n 100 /var/log/messages | grep steno exit 1 fi stenographer-1.0.1/install_el7.sh000077500000000000000000000133701372346644400170070ustar00rootroot00000000000000#!/usr/bin/env bash # Copyright 2014 Google Inc. All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # # stenographer - full packet to disk capture # # stenographer is a simple, fast method of writing live packets to disk, # then requesting those packets after-the-fact for post-hoc analysis. #===============================================================# # Installs Stenographer on CentOS 7.1 #===============================================================# export KILLCMD=/usr/bin/pkill export BINDIR="${BINDIR-/usr/bin}" export GOPATH=${HOME}/go export PATH=${PATH}:/usr/local/go/bin # Load support functions _scriptDir="$(dirname `readlink -f $0`)" source lib.sh check_sudo () { Info "Checking for sudo... " if (! sudo cat /dev/null); then Error "Failed. Please configure sudo support for this user." exit 1; fi } stop_processes () { Info "Killing any already running processes..." sudo service stenographer stop ReallyKill stenographer ReallyKill stenotype } install_packages () { Info "Installing stenographer package requirements... 
" sudo yum install -y epel-release; sudo yum makecache sudo yum install -y libaio-devel leveldb-devel snappy-devel gcc-c++ make libcap-devel libseccomp-devel &>/dev/null if [ $? -ne 0 ]; then Error "Error. Please check that yum can install needed packages." exit 2; fi } install_golang () { local _url="https://storage.googleapis.com/golang/go1.6.3.linux-amd64.tar.gz" if (! which go &>/dev/null ); then Info "Installing golang ..." TMP="$(mktemp -d)" pushd $TMP curl -L -O -J -s $_url sudo tar -C /usr/local -zxf $(basename $_url) sudo tee /etc/profile.d/golang.sh >/dev/null << EOF pathmunge /usr/local/go/bin export GOPATH=\${HOME}/go EOF popd fi } # Install jq, if not present install_jq () { local _url="https://github.com/stedolan/jq/releases/download/jq-1.5rc2/jq-linux-x86_64" if (! which jq &>/dev/null); then Info "Installing jq ..." curl -s -L -J $_url | sudo tee /usr/local/bin/jq >/dev/null; sudo chmod +x /usr/local/bin/jq; fi } add_accounts () { if ! id stenographer &>/dev/null; then Info "Setting up stenographer user" sudo adduser --system --no-create-home stenographer fi if ! getent group stenographer &>/dev/null; then Info "Setting up stenographer group" sudo addgroup --system stenographer fi } install_configs () { cd $_scriptDir Info "Setting up stenographer conf directory" if [ ! -d /etc/stenographer/certs ]; then sudo mkdir -p /etc/stenographer/certs sudo chown -R root:root /etc/stenographer/certs fi if [ ! -f /etc/stenographer/config ]; then sudo cp -vf configs/steno.conf /etc/stenographer/config sudo chown root:root /etc/stenographer/config sudo chmod 644 /etc/stenographer/config fi sudo chown root:root /etc/stenographer if grep -q /path/to /etc/stenographer/config; then Error "Create output directories for packets/index, then update" Error "/etc/stenographer/config" exit 1 fi } install_certs () { cd $_scriptDir sudo ./stenokeys.sh stenographer stenographer } install_service () { cd $_scriptDir if [ ! 
-f /etc/security/limits.d/stenographer.conf ]; then Info "Setting up stenographer limits" sudo cp -v configs/limits.conf /etc/security/limits.d/stenographer.conf fi if [ ! -f /etc/systemd/system/stenographer.service ]; then Info "Installing stenographer systemd service" sudo cp -v configs/systemd.conf /etc/systemd/system/stenographer.service sudo chmod 0644 /etc/systemd/system/stenographer.service fi } build_stenographer () { if [ ! -x "$BINDIR/stenographer" ]; then Info "Building/Installing stenographer" /usr/local/go/bin/go get ./... /usr/local/go/bin/go build sudo cp -vf stenographer "$BINDIR/stenographer" sudo chown stenographer:root "$BINDIR/stenographer" sudo chmod 700 "$BINDIR/stenographer" else Info "stenographer already exists at $BINDIR/stenographer. Skipping" fi } build_stenotype () { cd ${_scriptDir} if [ ! -x "$BINDIR/stenotype" ]; then Info "Building/Installing stenotype" pushd ${_scriptDir}/stenotype make popd sudo cp -vf stenotype/stenotype "$BINDIR/stenotype" sudo chown stenographer:root "$BINDIR/stenotype" sudo chmod 0500 "$BINDIR/stenotype" SetCapabilities "$BINDIR/stenotype" else Info "stenotype already exists at $BINDIR/stenotype. Skipping" fi } install_stenoread () { Info "Installing stenoread/stenocurl" sudo cp -vf stenoread "$BINDIR/stenoread" sudo chown root:root "$BINDIR/stenoread" sudo chmod 0755 "$BINDIR/stenoread" sudo cp -vf stenocurl "$BINDIR/stenocurl" sudo chown root:root "$BINDIR/stenocurl" sudo chmod 0755 "$BINDIR/stenocurl" } start_service () { Info "Starting stenographer service" sudo service stenographer start Info "Checking for running processes..." sleep 5 if Running stenographer; then Info " * Stenographer up and running" else Error " !!! Stenographer not running !!!" sudo tail -n 100 /var/log/messages | grep steno exit 1 fi if Running stenotype; then Info " * Stenotype up and running" else Error " !!! Stenotype not running !!!" 
sudo tail -n 100 /var/log/messages | grep steno exit 1 fi } check_sudo install_packages install_golang add_accounts build_stenographer build_stenotype install_jq install_configs install_certs install_service install_stenoread stop_processes start_service stenographer-1.0.1/integration_test/000077500000000000000000000000001372346644400176115ustar00rootroot00000000000000stenographer-1.0.1/integration_test/test.sh000077500000000000000000000106471372346644400211370ustar00rootroot00000000000000#!/bin/bash # Copyright 2014 Google Inc. All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. DUMMY="${DUMMY-dummy0}" PORT="${PORT-9123}" BASEDIR="${BASEDIR-/tmp}" SKIP_CLEANUP="${SKIP_CLEANUP}" SANITIZE="${SANITIZE}" PCAP_URL="https://archive.ll.mit.edu/ideval/data/2000/LLS_DDOS_1.0/data_and_labeling/tcpdump_inside/LLS_DDOS_1.0-inside.dump.gz" set -e cd $(dirname $0) source ../lib.sh function PullDownTestData { if [ ! 
-f $BASEDIR/steno_integration_test.pcap ]; then Info "Pulling down pcap data" curl -k -L "$PCAP_URL" > $BASEDIR/steno_integration_test.pcap.gz gunzip $BASEDIR/steno_integration_test.pcap.gz fi } function TestCountPackets { FILTER="$1" WANT="$2" Info "Looking $WANT packets from filter '$FILTER'" GOT="$(STENOGRAPHER_CONFIG="$OUTDIR/config" ../stenoread "$FILTER" -n | wc -l)" if [ "$GOT" != "$WANT" ]; then Error " - FAILED for filter '$FILTER': Want: $WANT Got: $GOT" exit 1 else Info " - SUCCESS: Got $GOT packets from filter '$FILTER'" fi } Info "Testing sudo access" sudo cat /dev/null InstallPackage tcpreplay PullDownTestData Info "Building stenographer" pushd ../ go build pushd stenotype make SANITIZE=$SANITIZE SetCapabilities stenotype STENOTYPE_BIN="$(pwd)/stenotype" popd popd Info "Setting up output directory" OUTDIR="$(mktemp -d $BASEDIR/steno.XXXXXXXXXX)" /bin/chmod g+rx "$OUTDIR" Info "Writing output to directory '$OUTDIR'" mkdir $OUTDIR/{pkt,idx,certs} Info "Setting up $DUMMY interface" sudo /sbin/modprobe dummy sudo ip link add $DUMMY type dummy || Error "$DUMMY may already exist" sudo ifconfig $DUMMY promisc up set +e STENOGRAPHER_PID="" STENOTYPE_PID="" function CleanUp { if [ -z "$SKIP_CLEANUP" ]; then Info "Cleaning up" if [ ! -z "$STENOGRAPHER_PID" ]; then Info "Killing stenographer ($STENOGRAPHER_PID)" KILLCMD=kill ReallyKill $STENOGRAPHER_PID fi if [ ! 
-z "$STENOTYPE_PID" ]; then Info "Killing stenotype ($STENOTYPE_PID)" KILLCMD=kill ReallyKill $STENOTYPE_PID fi Info "Deleting $DUMMY interface" sudo ifconfig $DUMMY down sudo ip link del dummy0 Info "--- LOG ---" /bin/cat "$OUTDIR/log" Info "Removing $OUTDIR" sudo find "$OUTDIR" -ls rm -rfv "$OUTDIR" fi } trap CleanUp EXIT cat > $OUTDIR/config << EOF { "Threads": [ { "PacketsDirectory": "$OUTDIR/pkt" , "IndexDirectory": "$OUTDIR/idx" , "DiskFreePercentage": 1 } ] , "StenotypePath": "$STENOTYPE_BIN" , "Interface": "$DUMMY" , "Port": $PORT , "Host": "127.0.0.1" , "Flags": ["-v", "-v", "-v", "--filesize_mb=16", "--aiops=16"] , "CertPath": "$OUTDIR/certs" } EOF Info "Setting up certs" CURR_USR="$(id -u -n)" CURR_GRP="$(id -g -n)" STENOGRAPHER_CONFIG="$OUTDIR/config" ../stenokeys.sh $CURR_USR $CURR_GRP Info "Starting stenographer" ../stenographer --config=$OUTDIR/config --syslog=false --v=4 >$OUTDIR/log 2>&1 & STENOGRAPHER_PID="$!" xterm -e "tail -f $OUTDIR/log" & Info "Waiting for stenographer to start up" Sleep 15 STENOTYPE_PID="$(ps axww | grep -v grep | grep -v Z | grep $STENOTYPE_BIN | awk '{print $1}')" if [ -z "$STENOTYPE_PID" ]; then Error "Stenotype not running" exit 1 fi Info "Sending packets to $DUMMY" sudo tcpreplay -i $DUMMY --topspeed $BASEDIR/steno_integration_test.pcap Sleep 80 Info "Looking for packets" TestCountPackets "port 21582" 1018 TestCountPackets "host 0.100.194.86" 2 TestCountPackets "net 0.0.0.0/8" 580 TestCountPackets "net 172.0.0.0/8 and port 23" 292041 Info "Sending packets to $DUMMY a second time" sudo tcpreplay -i $DUMMY --topspeed $BASEDIR/steno_integration_test.pcap Sleep 80 Info "Looking for packets a second time, in parallel" TESTPIDS="" TestCountPackets "port 21582" 2036 & TESTPIDS="$TESTPIDS $!" TestCountPackets "host 0.100.194.86" 4 & TESTPIDS="$TESTPIDS $!" TestCountPackets "net 0.0.0.0/8" 1160 & TESTPIDS="$TESTPIDS $!" TestCountPackets "net 172.0.0.0/8 and port 23" 584082 & TESTPIDS="$TESTPIDS $!" 
wait $TESTPIDS Info "Done" stenographer-1.0.1/lib.sh000066400000000000000000000030661372346644400153360ustar00rootroot00000000000000#!/bin/bash # Copyright 2014 Google Inc. All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # Some simple helper functions for writing bash scripts. function Info { echo -e -n '\e[7m' echo "$@" echo -e -n '\e[0m' } function Error { echo -e -n '\e[41m' echo "$@" echo -e -n '\e[0m' } function Kill { KILLCMD="${KILLCMD-killall}" sudo "$KILLCMD" "$@" 2>/dev/null >/dev/null } function Running { Kill -0 "$1" } function ReallyKill { if Running "$1"; then Info "Killing '$1'" Kill "$1" sleep 5 fi if Running "$1"; then Info "Killing '$1' again" Kill "$1" sleep 5 fi if Running "$1"; then Error "Killing '$1' with fire" Kill -9 "$1" sleep 1 fi } function InstallPackage { Info "Checking for package '$1'" if ! dpkg -s $1 >/dev/null 2>/dev/null; then Info "Have to install package $1" sudo apt-get install $1 fi } function SetCapabilities { sudo setcap 'CAP_NET_RAW+ep CAP_NET_ADMIN+ep CAP_IPC_LOCK+ep' "$1" } function Sleep { Info "Sleeping until $(date --date "$1 sec")" sleep $1 } stenographer-1.0.1/protobuf/000077500000000000000000000000001372346644400160675ustar00rootroot00000000000000stenographer-1.0.1/protobuf/steno.pb.go000066400000000000000000000203071372346644400201500ustar00rootroot00000000000000// Code generated by protoc-gen-go. DO NOT EDIT. 
// source: steno.proto

package steno

import (
	context "context"
	fmt "fmt"
	proto "github.com/golang/protobuf/proto"
	grpc "google.golang.org/grpc"
	math "math"
)

// Reference imports to suppress errors if they are not otherwise used.
var _ = proto.Marshal
var _ = fmt.Errorf
var _ = math.Inf

// This is a compile-time assertion to ensure that this generated file
// is compatible with the proto package it is being compiled against.
// A compilation error at this line likely means your copy of the
// proto package needs to be updated.
const _ = proto.ProtoPackageIsVersion3 // please upgrade the proto package

type PcapRequest struct {
	Uid                  string   `protobuf:"bytes,1,opt,name=uid,proto3" json:"uid,omitempty"`
	ChunkSize            int64    `protobuf:"varint,2,opt,name=chunk_size,json=chunkSize,proto3" json:"chunk_size,omitempty"`
	MaxSize              int64    `protobuf:"varint,3,opt,name=max_size,json=maxSize,proto3" json:"max_size,omitempty"`
	Query                string   `protobuf:"bytes,4,opt,name=query,proto3" json:"query,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_unrecognized     []byte   `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}

func (m *PcapRequest) Reset()         { *m = PcapRequest{} }
func (m *PcapRequest) String() string { return proto.CompactTextString(m) }
func (*PcapRequest) ProtoMessage()    {}
func (*PcapRequest) Descriptor() ([]byte, []int) {
	return fileDescriptor_a047459a1ab3dd2b, []int{0}
}

func (m *PcapRequest) XXX_Unmarshal(b []byte) error {
	return xxx_messageInfo_PcapRequest.Unmarshal(m, b)
}
func (m *PcapRequest) XXX_Marshal(b []byte, deterministic bool) ([]byte, error) {
	return xxx_messageInfo_PcapRequest.Marshal(b, m, deterministic)
}
func (m *PcapRequest) XXX_Merge(src proto.Message) {
	xxx_messageInfo_PcapRequest.Merge(m, src)
}
func (m *PcapRequest) XXX_Size() int {
	return xxx_messageInfo_PcapRequest.Size(m)
}
func (m *PcapRequest) XXX_DiscardUnknown() {
	xxx_messageInfo_PcapRequest.DiscardUnknown(m)
}

var xxx_messageInfo_PcapRequest proto.InternalMessageInfo

func (m *PcapRequest) GetUid() string {
	if m != nil {
		return m.Uid
	}
	return ""
}

func (m *PcapRequest) GetChunkSize() int64 {
	if m != nil {
		return m.ChunkSize
	}
	return 0
}

func (m *PcapRequest) GetMaxSize() int64 {
	if m != nil {
		return m.MaxSize
	}
	return 0
}

func (m *PcapRequest) GetQuery() string {
	if m != nil {
		return m.Query
	}
	return ""
}

type PcapResponse struct {
	Uid                  string   `protobuf:"bytes,1,opt,name=uid,proto3" json:"uid,omitempty"`
	Pcap                 []byte   `protobuf:"bytes,2,opt,name=pcap,proto3" json:"pcap,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_unrecognized     []byte   `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}

func (m *PcapResponse) Reset()         { *m = PcapResponse{} }
func (m *PcapResponse) String() string { return proto.CompactTextString(m) }
func (*PcapResponse) ProtoMessage()    {}
func (*PcapResponse) Descriptor() ([]byte, []int) {
	return fileDescriptor_a047459a1ab3dd2b, []int{1}
}

func (m *PcapResponse) XXX_Unmarshal(b []byte) error {
	return xxx_messageInfo_PcapResponse.Unmarshal(m, b)
}
func (m *PcapResponse) XXX_Marshal(b []byte, deterministic bool) ([]byte, error) {
	return xxx_messageInfo_PcapResponse.Marshal(b, m, deterministic)
}
func (m *PcapResponse) XXX_Merge(src proto.Message) {
	xxx_messageInfo_PcapResponse.Merge(m, src)
}
func (m *PcapResponse) XXX_Size() int {
	return xxx_messageInfo_PcapResponse.Size(m)
}
func (m *PcapResponse) XXX_DiscardUnknown() {
	xxx_messageInfo_PcapResponse.DiscardUnknown(m)
}

var xxx_messageInfo_PcapResponse proto.InternalMessageInfo

func (m *PcapResponse) GetUid() string {
	if m != nil {
		return m.Uid
	}
	return ""
}

func (m *PcapResponse) GetPcap() []byte {
	if m != nil {
		return m.Pcap
	}
	return nil
}

func init() {
	proto.RegisterType((*PcapRequest)(nil), "steno.PcapRequest")
	proto.RegisterType((*PcapResponse)(nil), "steno.PcapResponse")
}

func init() { proto.RegisterFile("steno.proto", fileDescriptor_a047459a1ab3dd2b) }

var fileDescriptor_a047459a1ab3dd2b = []byte{
	// 209 bytes of a gzipped FileDescriptorProto
	0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0xff, 0x6c, 0x90,
	0x4f, 0x4b, 0xc4, 0x30, 0x10, 0xc5, 0x8d, 0xdd, 0x55, 0x77, 0x36, 0x07,
	0x19, 0x3d, 0x54, 0x41, 0x58, 0x7a, 0xea, 0xa9, 0x88, 0x7a, 0xf3, 0x23,
	0x78, 0x91, 0xf4, 0x03, 0x48, 0xac, 0x83, 0x0d, 0xd2, 0x26, 0xcd, 0x1f,
	0xa9, 0xfd, 0xf4, 0xd2, 0xc9, 0x45, 0x61, 0x6f, 0xf3, 0xde, 0x83, 0xf7,
	0xcb, 0x0b, 0xec, 0x43, 0xa4, 0xd1, 0x36, 0xce, 0xdb, 0x68, 0x71, 0xcb,
	0xa2, 0xb2, 0xb0, 0x7f, 0xed, 0xb4, 0x53, 0x34, 0x25, 0x0a, 0x11, 0x2f,
	0xa1, 0x48, 0xe6, 0xa3, 0x14, 0x07, 0x51, 0xef, 0xd4, 0x7a, 0xe2, 0x1d,
	0x40, 0xd7, 0xa7, 0xf1, 0xeb, 0x2d, 0x98, 0x85, 0xca, 0xd3, 0x83, 0xa8,
	0x0b, 0xb5, 0x63, 0xa7, 0x35, 0x0b, 0xe1, 0x0d, 0x5c, 0x0c, 0x7a, 0xce,
	0x61, 0xc1, 0xe1, 0xf9, 0xa0, 0x67, 0x8e, 0xae, 0x61, 0x3b, 0x25, 0xf2,
	0x3f, 0xe5, 0x86, 0xdb, 0xb2, 0xa8, 0x9e, 0x40, 0x66, 0x60, 0x70, 0x76,
	0x0c, 0x74, 0x84, 0x88, 0xb0, 0x71, 0x9d, 0x76, 0xcc, 0x92, 0x8a, 0xef,
	0x87, 0x17, 0x90, 0xed, 0xfa, 0xde, 0x4f, 0xaf, 0x5d, 0x4f, 0x1e, 0x9f,
	0x41, 0x2a, 0x8a, 0xde, 0xd0, 0x37, 0xad, 0x6d, 0x88, 0x4d, 0xde, 0xf6,
	0x67, 0xcb, 0xed, 0xd5, 0x3f, 0x2f, 0xe3, 0xaa, 0x93, 0x7b, 0xf1, 0x7e,
	0xc6, 0x3f, 0xf0, 0xf8, 0x1b, 0x00, 0x00, 0xff, 0xff, 0x01, 0xa4, 0xf0,
	0xd9, 0x10, 0x01, 0x00, 0x00,
}

// Reference imports to suppress errors if they are not otherwise used.
var _ context.Context
var _ grpc.ClientConn

// This is a compile-time assertion to ensure that this generated file
// is compatible with the grpc package it is being compiled against.
const _ = grpc.SupportPackageIsVersion4

// StenographerClient is the client API for Stenographer service.
//
// For semantics around ctx use and closing/ending streaming RPCs, please refer to https://godoc.org/google.golang.org/grpc#ClientConn.NewStream.
type StenographerClient interface {
	RetrievePcap(ctx context.Context, in *PcapRequest, opts ...grpc.CallOption) (Stenographer_RetrievePcapClient, error)
}

type stenographerClient struct {
	cc *grpc.ClientConn
}

func NewStenographerClient(cc *grpc.ClientConn) StenographerClient {
	return &stenographerClient{cc}
}

func (c *stenographerClient) RetrievePcap(ctx context.Context, in *PcapRequest, opts ...grpc.CallOption) (Stenographer_RetrievePcapClient, error) {
	stream, err := c.cc.NewStream(ctx, &_Stenographer_serviceDesc.Streams[0], "/steno.Stenographer/RetrievePcap", opts...)
	if err != nil {
		return nil, err
	}
	x := &stenographerRetrievePcapClient{stream}
	if err := x.ClientStream.SendMsg(in); err != nil {
		return nil, err
	}
	if err := x.ClientStream.CloseSend(); err != nil {
		return nil, err
	}
	return x, nil
}

type Stenographer_RetrievePcapClient interface {
	Recv() (*PcapResponse, error)
	grpc.ClientStream
}

type stenographerRetrievePcapClient struct {
	grpc.ClientStream
}

func (x *stenographerRetrievePcapClient) Recv() (*PcapResponse, error) {
	m := new(PcapResponse)
	if err := x.ClientStream.RecvMsg(m); err != nil {
		return nil, err
	}
	return m, nil
}

// StenographerServer is the server API for Stenographer service.
type StenographerServer interface {
	RetrievePcap(*PcapRequest, Stenographer_RetrievePcapServer) error
}

func RegisterStenographerServer(s *grpc.Server, srv StenographerServer) {
	s.RegisterService(&_Stenographer_serviceDesc, srv)
}

func _Stenographer_RetrievePcap_Handler(srv interface{}, stream grpc.ServerStream) error {
	m := new(PcapRequest)
	if err := stream.RecvMsg(m); err != nil {
		return err
	}
	return srv.(StenographerServer).RetrievePcap(m, &stenographerRetrievePcapServer{stream})
}

type Stenographer_RetrievePcapServer interface {
	Send(*PcapResponse) error
	grpc.ServerStream
}

type stenographerRetrievePcapServer struct {
	grpc.ServerStream
}

func (x *stenographerRetrievePcapServer) Send(m *PcapResponse) error {
	return x.ServerStream.SendMsg(m)
}

var _Stenographer_serviceDesc = grpc.ServiceDesc{
	ServiceName: "steno.Stenographer",
	HandlerType: (*StenographerServer)(nil),
	Methods:     []grpc.MethodDesc{},
	Streams: []grpc.StreamDesc{
		{
			StreamName:    "RetrievePcap",
			Handler:       _Stenographer_RetrievePcap_Handler,
			ServerStreams: true,
		},
	},
	Metadata: "steno.proto",
}
stenographer-1.0.1/protobuf/steno.proto000066400000000000000000000016321372346644400203060ustar00rootroot00000000000000// Copyright 2019 Josh Liburdi and Google Inc. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
syntax = "proto3";

package steno;

service Stenographer {
  rpc RetrievePcap(PcapRequest) returns (stream PcapResponse) {}
}

message PcapRequest {
  string uid = 1;
  int64 chunk_size = 2;
  int64 max_size = 3;
  string query = 4;
}

message PcapResponse {
  string uid = 1;
  bytes pcap = 2;
}
stenographer-1.0.1/query/000077500000000000000000000000001372346644400153745ustar00rootroot00000000000000stenographer-1.0.1/query/parser.y000066400000000000000000000141161372346644400170650ustar00rootroot00000000000000// Copyright 2014 Google Inc. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

%{
// Copyright 2014 Google Inc. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package query

import (
	"fmt"
	"net"
	"strconv"
	"strings"
	"time"
	"unicode"
)
%}

%union {
	num   int
	ip    net.IP
	str   string
	query Query
	dur   time.Duration
	time  time.Time
}

%type <query> top expr expr2
%type