Inbuilt Logging in Linux and Honeypots

Posted by Janani Kehelwala on July 07, 2018 · 9 mins read

User Activity Logs

User activity logs record user-initiated events occurring inside a host. Some of the tools that produce them can be configured to record an even more granular level of detail. At a minimum, these logs should be configured to contain the time of occurrence, the initiating user, the program/command used and the result of the action. The program/command can often be inferred, since each program logs to a specific file.

Each of the tools below records specific details related to user activity.

Logging tools in Linux

- **utmp**: users currently logged into the system
- **wtmp**: historical record of logins and logouts
- **btmp**: failed login attempts
- **lastlog**: most recent login for each user
- **messages**: global system messages, such as logs from a mix of other facilities (mail, cron, kern, auth, daemon, etc.) and messages logged during boot. In some modern Unix distributions this file is replaced by a syslog file. Among the many facilities logged here, **auth** entries contain system authorization information such as user logins, the authentication mechanism used in those logins, and session start/end information.
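
To give a concrete feel for what these files hold, here is a minimal Python sketch that simply shells out to the standard last, lastb and lastlog utilities (which read wtmp, btmp and the lastlog database respectively) and prints a quick summary. It assumes those utilities are present and that the script runs with enough privilege to read btmp.

```python
#!/usr/bin/env python3
"""Minimal sketch: summarize login activity from wtmp, btmp and lastlog.

Shells out to the standard `last`, `lastb` and `lastlog` utilities rather
than parsing the binary record formats directly. Reading /var/log/btmp
typically requires root privileges.
"""
import subprocess

def run(cmd):
    """Run a command and return its stdout, or an empty string on failure."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError):
        return ""

if __name__ == "__main__":
    print("== Recent logins (wtmp) ==")
    print(run(["last", "-n", "10"]))

    print("== Recent failed logins (btmp) ==")
    print(run(["lastb", "-n", "10"]))

    print("== Most recent login per user (lastlog) ==")
    print(run(["lastlog"]))
```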

An important aspect of network security is preserving an audit trail of the activities performed on a host. Audit trails are required by standards such as the ISO/IEC 27000 family, but more importantly, they are needed to establish non-repudiation and accountability in the event of a security-sensitive incident.

Intrusion detection is another important aspect of network security. In this context, user activity logs can be used to generate and refine user behavior profiles. Setting aside recent machine-learned systems, user behavior profiles are among the most effective inputs to intrusion detection because of their tailored nature: they make it easy to detect when a user does something out of character, whereas combined system-wide information is likely to generate a large number of false positives due to the varying behavior of individual users. Furthermore, thresholds for statistical analysis can be derived from user activity logs, and anomaly detection systems can be fed more meaningful information to reduce false positive rates. Log feedback can also be used to tune attack signatures or firewall rule sets.
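
As a toy illustration of deriving a threshold from user activity logs (not a real anomaly detector), the following Python sketch counts failed login attempts per user from lastb and flags users whose count exceeds the mean plus two standard deviations; the specific threshold is an arbitrary assumption, and running lastb typically requires root.

```python
#!/usr/bin/env python3
"""Toy sketch: derive a statistical threshold from failed-login counts.

Counts failed login attempts per user from `lastb` (the btmp log) and
flags users whose count exceeds mean + 2 standard deviations. Real
anomaly detection would use far richer per-user profiles; this only
illustrates deriving a threshold from user activity logs.
"""
import statistics
import subprocess
from collections import Counter

def failed_logins_per_user():
    """Return a Counter of failed login attempts per user name."""
    out = subprocess.run(["lastb"], capture_output=True, text=True).stdout
    counts = Counter()
    for line in out.splitlines():
        fields = line.split()
        if not fields or line.startswith("btmp begins"):
            continue  # skip blank lines and the trailing "btmp begins ..." summary
        counts[fields[0]] += 1
    return counts

if __name__ == "__main__":
    counts = failed_logins_per_user()
    if len(counts) >= 2:
        mean = statistics.mean(counts.values())
        stdev = statistics.pstdev(counts.values())
        threshold = mean + 2 * stdev  # arbitrary illustrative threshold
        for user, n in counts.most_common():
            if n > threshold:
                print(f"suspicious: {user} has {n} failed logins (threshold {threshold:.1f})")
    else:
        print("not enough data to derive a threshold")
```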

Kernel logs and daemon messages written to “messages” also provide crucial details for hardware maintenance, which is important in assuring availability and physical security. They can further be used to record system on/off periods, information useful for discovering out-of-band activity (e.g. activity during a user's absence).

User activity logger: Syslogd

Logs cannot be stored indefinitely on host machines due to storage concerns, especially on servers that serve a large number of users and generate a mass of logs each day. Therefore, a user activity logger such as the syslog daemon (syslogd) can be used to forward event messages to a dedicated logging server.

Syslogd reads from a Unix domain socket (/dev/log), to which many applications write their logs, and, if configured, from a UDP socket to which log messages can be sent from other machines. Each message carries classification information: a “facility”, identifying the subsystem that submitted the log (the kernel, the mail subsystem, a server daemon, and so on), and a “priority” (or severity level) such as debug, informational, warning or critical; the syslog specification actually defines eight severity levels, so finer distinctions can be configured. Based on this classification information and its configuration file, syslogd routes messages in various ways: to the system console, to log files, to other machines and/or daemons, or to the terminals of specified logged-in users.
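
As an illustration of the facility/priority classification and of forwarding to a remote collector, here is a minimal Python sketch using the standard library's SysLogHandler. The local /dev/log socket is Linux-specific, and the remote address is a placeholder for a central syslog server in your environment.

```python
#!/usr/bin/env python3
"""Minimal sketch: submit log messages to syslogd with a facility and priority.

The first handler writes to the local /dev/log socket (Linux-specific);
the second forwards the same messages over UDP to a remote syslog server.
The remote address below is a placeholder for a real collector.
"""
import logging
import logging.handlers

logger = logging.getLogger("demo-app")
logger.setLevel(logging.DEBUG)

# Local delivery: write to the Unix domain socket that syslogd reads.
local_handler = logging.handlers.SysLogHandler(
    address="/dev/log",
    facility=logging.handlers.SysLogHandler.LOG_AUTH,
)

# Remote delivery: send over UDP to a central syslog server (placeholder address).
remote_handler = logging.handlers.SysLogHandler(
    address=("127.0.0.1", 514),
    facility=logging.handlers.SysLogHandler.LOG_AUTH,
)

for handler in (local_handler, remote_handler):
    handler.setFormatter(logging.Formatter("demo-app: %(message)s"))
    logger.addHandler(handler)

# Logging levels map onto syslog priorities (debug, info, warning, critical, ...).
logger.info("user alice logged in from 192.0.2.10")
logger.warning("3 failed login attempts for user bob")
```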

Protection against security breaches requires being proactive, which in turn requires regular inspection of log files. Regular inspections warn about malfunctioning equipment, or provide continued assurance of a healthy, protected network. This is difficult to do in a large network with many endpoints. To make matters worse, some intrusions would go unnoticed if certain log data were missing from a network-wide view; for example, a denial of service might be launched against a low-priority server to draw attention away from an attack on a high-value server. Detection and prompt response require visibility into what is happening at multiple endpoints at the moment it happens. It is therefore important to deliver logs in near real time to a unified location, and user activity loggers such as syslogd can be used for this purpose. Alternatively, logs could be distributed to separately assigned personnel for better insight and understanding, depending on the requirement.

In a system crash or a compromise, the integrity of log files takes priority. Intact, verifiable logs can absolve an organization or user of blame, reputational damage and financial loss. Legal requirements may also mandate that logs be preserved for a specific period, such as one year, which local storage may be insufficient to hold. To serve these integrity and storage needs, logs should be stored on a separate server, and user activity loggers can be used to deliver them there.

Logs can also prove extremely useful in forensic analysis after an incident, but a few preconditions have to be met. Log rotation has to be configured to deliver archived, expired logs to an external server. System clocks across the network have to be synchronized so that log timestamps are as accurate as possible. A keyed hash should be stored with each log entry so that its integrity can be established when reconstructing the events of a security incident and preserving its state as legally admissible evidence.
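
One way to realize such a keyed hash, sketched below in Python, is an HMAC chain in which each entry's hash also covers the previous entry's hash, so that modification, deletion or reordering breaks the chain. The key handling shown (an environment variable) is purely illustrative; in practice the key must not remain on the monitored host.

```python
#!/usr/bin/env python3
"""Minimal sketch: keyed, chained hashes over log entries.

Each entry's HMAC covers the previous entry's HMAC as well, so that
modifying, removing or reordering entries breaks the chain. Reading the
key from an environment variable is purely illustrative; it should only
be available where verification happens (e.g. the log server).
"""
import hashlib
import hmac
import os

KEY = os.environ.get("LOG_HMAC_KEY", "change-me").encode()  # illustrative only

def chain_entries(entries):
    """Yield (entry, hex_mac) pairs, where each MAC covers the previous MAC."""
    prev_mac = b""
    for entry in entries:
        mac = hmac.new(KEY, prev_mac + entry.encode(), hashlib.sha256).hexdigest()
        yield entry, mac
        prev_mac = mac.encode()

def first_tampered_index(pairs):
    """Recompute the chain; return the index of the first bad entry, or None."""
    prev_mac = b""
    for i, (entry, mac) in enumerate(pairs):
        expected = hmac.new(KEY, prev_mac + entry.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, mac):
            return i
        prev_mac = mac.encode()
    return None

if __name__ == "__main__":
    log = ["Jul  7 10:01:02 host sshd[123]: Accepted password for alice",
           "Jul  7 10:05:40 host sudo: alice : TTY=pts/0 ; COMMAND=/bin/ls"]
    signed = list(chain_entries(log))
    for entry, mac in signed:
        print(mac[:16], entry)
    print("tampered entry index:", first_tampered_index(signed))
```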

Are these tools sufficient for a honeypot?

It is possible to construct a honeypot using the above tools, but such a honeypot would be somewhat limited. Since no protocol-emulating software would be used, it would have to be a high-interaction honeypot running real services. And because the purpose of a high-interaction honeypot is to capture the maximum amount of information about the threat landscape, in order to justify the dedicated hardware costs, it would require additional loggers to be of maximum practical use.

The aforementioned user activity logs would provide the user logs and some of the system logs. Network data would have to be logged separately. If possible, software keystroke logging should also be enabled to capture as much input as possible.
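
As a rough sketch of logging network data separately, the following Python snippet opens a Linux raw packet socket (root required) and records only basic frame metadata; in a real deployment a capture tool such as tcpdump would normally be used instead, and the log file path here is illustrative.

```python
#!/usr/bin/env python3
"""Minimal sketch: log basic network metadata on a honeypot (Linux only, root required).

Opens an AF_PACKET raw socket and, for every frame, records a timestamp,
source/destination MAC addresses, EtherType and length. A real deployment
would normally capture full packets with tcpdump/pcap; this only shows
that network data must be logged separately from user activity logs.
"""
import socket
import struct
import time

ETH_P_ALL = 0x0003  # capture frames of every protocol

def mac(addr_bytes):
    """Render a 6-byte MAC address as aa:bb:cc:dd:ee:ff."""
    return ":".join(f"{b:02x}" for b in addr_bytes)

def main():
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
    with open("/var/log/honeypot-net.log", "a") as logfile:  # illustrative path
        while True:
            frame, _ = sock.recvfrom(65535)
            if len(frame) < 14:
                continue  # not a full Ethernet header
            dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
            logfile.write(f"{time.time():.3f} {mac(src)} -> {mac(dst)} "
                          f"type=0x{ethertype:04x} len={len(frame)}\n")
            logfile.flush()

if __name__ == "__main__":
    main()
```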

Since the purpose of a honeypot is to lure attackers away from genuinely valuable resources, the honeypot would hold some data of apparent but false value, or have specifications attractive enough that attackers would want to compromise it, for example to recruit it into a botnet. However, since this exposes the organization to legal liability, it is wise to filter outgoing traffic. The concern remains that this may make the attacker suspicious and render the machine unattractive to him.

For such a honeypot to be genuinely useful, log integrity would have to be assured. A plain checksum is not enough, as an attacker could rewrite the logs with false entries and recompute the checksums, so a keyed hash is needed; and even then, the attacker could simply wipe or destroy the logs. Therefore, all logs should also be delivered to a dedicated syslog server running syslogd. This server should be configured to accept only log messages on the designated port, with no outgoing traffic, to preserve integrity and to keep it from being compromised itself. The syslog server should look inconspicuous in the network even though it has most ports disabled. It should be expected that the honeypot will be used to attack the internal network, and therefore all servers it communicates with should be secured appropriately.
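
To illustrate the collector side, here is a minimal Python sketch of a write-only UDP listener on the syslog port that appends whatever it receives to a file and never replies. In practice this role is played by syslogd/rsyslog plus host firewall rules; the listening port and file path here are illustrative, and binding to port 514 requires root.

```python
#!/usr/bin/env python3
"""Minimal sketch: a write-only log collector.

Listens on UDP/514 (binding to it requires root) and appends every
received datagram to a file, together with the sender address and
arrival time. It never sends anything back, mirroring the idea of a
collector with no outgoing traffic. In practice this role is played by
syslogd/rsyslog with a restrictive configuration; this is only an
illustration.
"""
import socket
import time

LISTEN_ADDR = ("0.0.0.0", 514)
LOG_PATH = "/var/log/collected-syslog.log"  # illustrative path

def main():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(LISTEN_ADDR)
    with open(LOG_PATH, "a") as logfile:
        while True:
            data, (host, port) = sock.recvfrom(8192)
            line = data.decode("utf-8", errors="replace").rstrip()
            logfile.write(f"{time.strftime('%Y-%m-%dT%H:%M:%S')} {host}:{port} {line}\n")
            logfile.flush()  # flush promptly so entries survive a crash

if __name__ == "__main__":
    main()
```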

Another consideration is that it would be best to tunnel log delivery through alternative methods (ICMP tunnels, encrypted UDP) so as not to tip off the attacker. A tunneling method that itself looks malicious would be the most tactful choice, since it could convince the attacker that the system is already compromised and lull him into a false sense of security.
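
As one possible shape for "encrypted UDP" delivery, the sketch below encrypts each log line with a pre-shared symmetric key (using the third-party cryptography package's Fernet construction, an assumption of this sketch) and sends it to the collector as a plain UDP datagram. The collector address and key handling are placeholders, and an ICMP tunnel or similar covert channel would require considerably more machinery.

```python
#!/usr/bin/env python3
"""Minimal sketch: forward log entries as encrypted UDP datagrams.

Encrypts each log line with a pre-shared Fernet key (from the third-party
`cryptography` package, an assumption of this sketch) and sends it to the
collector over UDP; the collector decrypts with the same key. The address
and key handling below are placeholders only.
"""
import socket

from cryptography.fernet import Fernet  # pip install cryptography

COLLECTOR = ("127.0.0.1", 5140)  # placeholder collector address
KEY = Fernet.generate_key()      # in practice: generated once and pre-shared

def send_encrypted(lines, key=KEY, collector=COLLECTOR):
    """Encrypt each line and send it to the collector as a UDP datagram."""
    cipher = Fernet(key)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for line in lines:
        sock.sendto(cipher.encrypt(line.encode()), collector)

if __name__ == "__main__":
    send_encrypted(["Jul  7 10:01:02 honeypot sshd[123]: Accepted password for root"])
```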

In summary, while the aforementioned tools can serve important roles in a honeypot, a practically usable honeypot needs to cover a few more aspects, not least the logging of network transmissions, as those would be the most telling of all the logs.