» tagged pages
» logout

(Feed found, click Add Page to syndicate.) Error finding feed, please try again » Find feed title

A Blog Page allows you to add entries, for news or other time sensitive postings

(Login required to save to your tagged pages.)
(or Cancel)

Make further edits, (or Cancel)

(Login required to save to your tagged pages.)
(or Cancel)

(Editing anonymously: to be credited for your changes, login or register a new account)

Change Page Permissions? Changing these permissions will adjust who can modify this page.

Anonymous (change)
Swik Users (change)
(or Cancel)
Upload an image from your computer:
or Copy an image from a URL:
or Erase the current icon:
Icon Preview:

or Cancel

Erase logging? The contents of logging page and all pages directly attached to logging will be erased.

or Cancel

(Editing anonymously: to be credited for your changes, login or register a new account)

other page actions:
logging

logging

sorted by: recent | see : popular
Content Tagged logging

MySQL Conference Liveblogging: Optimizing MySQL For High Volume Data Logging Applications (Thursday 2:50PM)

  • http://en.oreilly.com/mysql2008/public/schedule/detail/874
  • presented by Charles Lee of Hyperic
  • Hyperic has the best performance with MySQL out of MySQL, Oracle, and Postgres in their application
  • I suddenly remember hyperic was highly recommended above nagios in MySQL Conference Liveblogging: Monitoring Tools (Wednesday 5:15PM)
  • performance bottleneck
    • the database
      • CPU
      • memory
    • IO
      • disk latency
      • network latency
    • slow queries
  • media size deployment example
    • 300 platforms (300 remote agents collecting data)
    • 2,100 servers
    • 21,000 services (10 services per server), sounds feasible
    • 468,000 metrics (20 metrics per service)
    • 28,800,000 metric data rows per day
    • larger deployments have a lot more of these (sounds crazy)
  • data
    • measurement_id
    • timestamp
    • value
    • primary key (timestamp, measurement_id)
  • data flow
    • agent collects data and sends reports to server with multiple data points
    • server batch inserts metric data points
    • if network connection fails, agent continues to collect but server "backfills" unavailable
    • when agent reconnects, spooled data overwrite backfilled data points (why not use REPLACE for all inserts?)
  • things are very basic so far
  • batch insert
    • INSERT INTO TABLE (a,b,c) VALUES (0,0,0), (1,1,1),…
    • using MySQL batch insert statements vs prepared statements with multiple queries in other databases seems to improve overall performance by 30%
    • batch inserts are limited by 'max_allowed_packet'
  • other options for increasing insert speed
    • set unique_checks=0, insert, set unique_checks=1 (definitely need to make sure data is valid first)
    • set foreign_key_checks=0, insert, set foreign_key_checks=1 (same concerns as above)
    • Hyperic doesn't use the 2 above
  • INSERT … ON DUPLICATE KEY UPDATE
    • when regular INSERT fails, retry batch with INSERT ON DUPLICATE KEY syntax
    • it's much slower but it allows
  • this is all basic, where are the performance tweaks?!
  • batch aggregate inserter
    • queue metric data from separate agent reports
      • minimize number of inserts, connections, CPU load
      • maximize workload efficiency
    • optimal configuration for 700 agents
      • 3 workers
      • 2000 batch size seems to work best
      • queue size of 4,000,000
    • this seems to peak at 2.2mil metric data inserts per minute
  • data consolidation
    • inspired by rrdtool
    • lower resolution tables track min, avg, and max
    • data compression runs hourly
    • size limit 2 days
    • every hour, data is rolled up into another table that holds hourly aggregated values with size limit 14 days, then that one gets rolled up into a monthly table, etc
    • this is is a good approach if you don't care about each data point
  • I'm overwhelmed by the amount of "you know"s from the speaker. Parasite words, ahh! Sorry Charles )
  • software partitioning
    • measurement data split into 18 tables, representing 9 days (2 per day)
    • they didn't want to do more than 2 SELECTs to get data per day, hence such sharding
    • oddly, Charles didn't actually use the word 'shard' once
    • tables truncated, rather than deleting rows => huge performance boost
    • truncation vs deletion
      • deletion causes contention on rows
      • truncation doesn't produce fragmentation
      • truncation just drops and recreates the table - single DDL operation
  • indexes
    • every InnoDB table has a special index called the clustered index (based on primary key) where the physical data for the rows is stored
    • advantages
      • selects faster - row data is on the same page where the index search leads
      • inserts in (timestamp) order - avoid page splits and fragmentation
    • shows comparison between non-clustered index and clustered index (see slides)
  • still no mention of configuration tweaks
  • UNION ALL works better than inner SELECTS because the optimizer didn't optimize them enough (at least in the version these guys are using, not sure which)
  • recommended server options are on the very last slide, I was waiting for those the most! I guess I'll look up the slides after
Similar Posts:

MySQL: Planet MySQL

Configure Snort to log packets to MySQL - Program - Linux - Builder AU

Last week, we looked at setting up Snort, a Network Intrusion Detection System. Now we will look at configuring Snort to log packets to<sep/>

snort: del.icio.us/tag/snort

Using Syslog information with SourceLabs Linux Self-Support

Below is documentation for using SourceLabs Self-Support Suite for Linux and Open Source Java to gather, organize, and search Syslog messages.

You can setup syslog to connect to the relay. In this release, we only support syslog over UDP. To set this up, you need to modify your syslogd config file. Often this is in /etc/syslog.conf

To get started, you can add

. @relayMachineName:4657

This will send all your syslog messages to a machine named relayMachineName over port 4657. Port 4657 is the default port the relay listens to for Syslog UDP messages. You can change this in the relay’s property file. This properties file is located in the ‘conf’ directory of your relay installation (e.g. /opt/sourcelabs/relay/conf).

To filter the syslog messages sent to the relay, please read the syslog documentation or man pages for more information.

Syslog-ng:

You can also setup syslog-ng to use the relay. For the current version of the relay, you will need to use UDP logging to the relay instead of TCP. To set this up, you need to modify your syslog-ng config file. Often this is in /etc/syslog-ng.conf

To get started, you can add
destination d_relay { udp("relayMachineName ", port(4657)); }; log { source(s_sys); destination(d_relay); };

Assuming that you are using a default syslog-ng configuration where source ‘s_sys’ represents all syslog messages (modify source as necessary for your configuration),this will send all your syslog messages to a machine named relayMachineName over port 4657. Port 4657 is the default port the relay listens to for Syslog UDP messages. You can change this in the relay’s property file. This properties file is located in the ‘conf’ directory of your relay installation (e.g. /opt/sourcelabs/relay/conf).

To filter the syslog messages sent to the relay, please read the syslog documentation or man pages for more information.

Find more information about SourceLabs Self-Support Suite for LInux and Open Source Java

Log4jXmlFormat - Logging-log4j Wiki

ender-ref ref="console" />

XML: del.icio.us/tag/xml

Page 1 | Next >>
Username:
Password:
(or Cancel)