» tagged pages
» logout

sorted by: recent | see : popular
Content Tagged with Personal + MySQL

Power-Efficiency Study For MySQL Database Server

Introduction

The purpose of this document is to explain the power-efficeincy analysis proess, power-efficiency issues and solutions in MySQL database server as an example to help identify the right tools and procedures that support the development of energy-efficient application.


Analysis Process

  1. Install and start test application(MySQL) on idle system, create workload to simulate the scenario that multiple concurrent clients connecting to the MySQL server without executing any queries.

  2. Run PowerTop on Solaris to see how much time the CPUs are spending running in lower power states, and how much time the CPU are spending running in C0 state(meaning CPU is actually executing instructions.)

  3. Use DTrace to profile application understand the source of the wakeups causing power-consuming activity.

The remainder of the document covers the detail information on each of the above procedures


Test Setup And MySQL Configurations

HW Installation:

Host Server Type

SunFire x4150Server

CPU: 8x2826MHz Intel-Xeon

Memory: 16GB

Client Server Type

SunFire v20z server

CPU: 2x1793MHz AMD

Memory:2G


SW Installation:

Host OS

Solaris snv_96 x86

MySQL Database Server

5.0.51

Test App

Sysbench

commenting query execution:

//rc = mysql_stmt_execute(stmt->ptr);

Sysbench test Installation Steps:

  • shell> cd /sysbench-version

  • shell> ./configure

  • shell> make

  • shell> make install

MySQL Configurations:

a. Standalone MySQL server

default configuration

b. Master-slave MySQL Server

master db server: /etc/my.cnf

[mysqld]

server-id=1

log-bin= /usr/local/mysql/data/mysql-bin.log


expire_logs_days = 10

max_binlog_size = 500M

slave db server: /etc/my.cnf

[mysqld]

server-id=2

master-host = v20z-241-30

master-user = slave_v20z

master-password = passw0rd

master-connect-retry = 60


C. Master-master MySQL Server

master db server: /etc/my.cnf

[mysqld]

replicate-same-server-id = 0

server-id=1

auto-increment-increment = 2

auto-increment-offset = 1


master-host = 10.6.241.106

master-user = slave_x4150

master-password = passw0rd

master-connect-retry = 60


log-bin= /usr/local/mysql/data/mysql-bin.log


expire_logs_days = 10

max_binlog_size = 500M


Command To run 16 concurrent user sysbench oltp test(no queries):

standalone mysql server configuration:

  • shell> sysbench –test=oltp –oltp-table-size=100000 –mysql-db=sbtest –max-requests=0 --oltp-dist-type=special —num-threads=16 run

replication mysql server(M/S, M/M) configuration:

  • shell> sysbench –test=oltp –oltp-table-size=100000 –mysql-db=sbtest –max-requests=0 --oltp-dist-type=special –mysql-host=x4150-241-06 —num-threads=16 run


Test Observations and Results

Before starting MySQL server, the system is idle.

shell>powertop

Solaris PowerTOP version 1.0       (C) 2007 Intel Corporation              

Cn                      Avg     residency       P-states (frequencies)          
C0 (cpu running)                (3.7%)          2826 Mhz        100.0%          
C1                      2.5ms   (96.3%)                                                                                                                     Wakeups-from-idle per second: 384.9     interval: 8.0s                          
no ACPI power usage estimate available

Top causes for wakeups:                                                         
26.0% (100.0)               <kernel> :  genunix`clock                           
25.9% ( 99.8)               <kernel> :  uhci`uhci_handle_root_hub_status_change 
19.1% ( 73.4)                  sched :  <scheduled timeout expiration>          
 8.7% ( 33.4)               <kernel> :  ehci`ehci_handle_root_hub_status_change 
 4.4% ( 16.8)                   java :  <scheduled timeout expiration>          
 1.0% (  4.0)               <kernel> :  uhci`uhci_cmd_timeout_hdlr              
 1.0% (  4.0)               <kernel> :  genunix`schedpaging

The powertop tool for Solaris x86 reports how much time CPUs spending in lower power states(C1), and how much time the CPUs are runing on C0 state(executing instructions). Even though the system show idle  from processor monitor tools(vmstat, mpstat), the powertop can show that not 100% of its time running at the C1 state, but arround 3.7% time running on C0 state with wakeups  mainly from some kernel activities.

Test 1: Standalone MySQL server Configuration

When running 16-concurrent users  sysbench test, the powertop report the similar data as expected since there is no queries executed and the MySQL server didn't do any work:

Solaris PowerTOP version 1.0       (C) 2007 Intel Corporation

Cn                      Avg     residency       P-states (frequencies)          
C0 (cpu running)                (3.9%)          2826 Mhz        100.0%          
C1                      2.5ms   (96.1%)                                         
                                                                             
                                                                                
Wakeups-from-idle per second: 389.3     interval: 8.0s                          
no ACPI power usage estimate available

Top causes for wakeups:                                                         
25.8% (100.2)               <kernel> :  uhci`uhci_handle_root_hub_status_change 
25.7% (100.0)               <kernel> :  genunix`clock                           
18.8% ( 73.1)                  sched :  <scheduled timeout expiration>          
 8.6% ( 33.3)               <kernel> :  ehci`ehci_handle_root_hub_status_change 
 4.3% ( 16.6)                   java :  <scheduled timeout expiration>          
 1.0% (  4.0)               <kernel> :  uhci`uhci_cmd_timeout_hdlr              
 1.0% (  4.0)               <kernel> :  genunix`schedpaging

Test 2: Master-slave MySQL server configuration

On slave server, the powertop report :

 Solaris PowerTOP version 1.0       (C) 2007 Intel Corporation              

Cn                      Avg     residency       P-states (frequencies)          
C0 (cpu running)                (27.3%)         2826 Mhz        100.0%          
C1                      0.1ms   (72.7%)                                         
                                                                          
Wakeups-from-idle per second: 9846.8    interval: 8.0s                          
no ACPI power usage estimate available

Top causes for wakeups:                                                         
33.5% (3301.1)           <interrupt> :  e1000g#0                                
15.6% (1532.4)           <interrupt> :  aac#0                                   
 1.0% (100.2)               <kernel> :  uhci`uhci_handle_root_hub_status_change 
 1.0% (100.0)               <kernel> :  genunix`clock                           
 0.7% ( 73.2)                  sched :  <scheduled timeout expiration>          
 0.3% ( 33.3)               <kernel> : ehci`ehci_handle_root_hub_status_change 
 0.2% ( 16.7)                   java :  <scheduled timeout expiration>


Use DTrace tracing MySQL's hot function calls to help understand why MySQL slave server is causing CPU wake up from idle state to waste power:



#!/usr/sbin/dtrace -qs


pid$1:::entry

{

self->ts = vtimestamp;

}


pid$1:::return

/self->ts/

{

@a["Count",probefunc] = count();

@b["Time",probefunc] = sum(vtimestamp - self->ts);

self->ts = 0;

}


tick-10sec

{

exit(0);

}


Identified that the cuase of the wakeups was the known MySQL's bug: http://bugs.mysql.com/bug.php?id=33815

Re-configure MySQL's slave server's “server-id = 3” in /etc/my.cnf, start 16 con-current users sysbench test, the powertop report close to 96% low power state(C1) and there is no wakeups from MySQL.


Test 3: Master-master MySQL server configuration

When running 16-concurrent users  sysbench test connecting to the MySQL master server, the powertop report no wakeups from MySQL as expected since there is no queries executed by the MySQL server:

Solaris PowerTOP version 1.0       (C) 2007 Intel Corporation              

Cn                      Avg     residency       P-states (frequencies)          
C0 (cpu running)                (4.7%)          2826 Mhz        100.0%          
C1                      1.9ms   (95.3%)                                         
                                                                                
                                                                                
                                                                                
                                                                                
Wakeups-from-idle per second: 496.1     interval: 8.0s                          
no ACPI power usage estimate available

Top causes for wakeups:                                                         
20.2% (100.2)               <kernel> :  uhci`uhci_handle_root_hub_status_change 
20.2% (100.0)               <kernel> :  genunix`clock                           
14.8% ( 73.4)                  sched :  <scheduled timeout expiration>          
 6.7% ( 33.3)               <kernel> :  ehci`ehci_handle_root_hub_status_change 
 3.4% ( 16.7)                   java :  <scheduled timeout expiration>          
 0.8% (  4.1)            <interrupt> :  e1000g#0                                
 0.8% (  4.0)               <kernel> :  uhci`uhci_cmd_timeout_hdlr              
 0.8% (  4.0)               <kernel> :  genunix`schedpaging                     
 0.4% (  2.0)               <kernel> :  fcip`fcip_timeout

Power and Performance Measurement

With the different MySQL configuration(single server, replication server) in the tests above, we can see when MySQL's running on idle system, CPU can spend the lowest power states. In order to reduce the energy costs, we can work on improving MySQL applicaiton's performance(response time) to keep system in idle state longer saving the CPU resource.

There are different places that we can make changes to improve performance of a MySQL application: schema optimization and indexing, query performance optimization, tuning database server settings. The more detail and deep information on MySQL performance was covered in the guide book of High Perormance MySQL,2nd edition written by the MySQL performance experts. In this document, I have research test results bellow showing that increasing MySQL performance with these methods can actually reduce CPU utilization to save energy.


Performance Test Tool

  1. mysqlslap: emulate client load for MySQL server. This tool is bundled with MySQL 5.1 server binary on Solaris OS

  2. database schema: world schema(availabe at: http://dev.mysql.com/doc)

  3. Execute custom query file with mysqlslap:

    shell>mysqlslap -uroot -p -q ./myqueries.sql --create-schema=world -i 8

Test 1: Indexing

Create a new table city_huge based on City table in world database

mysql>create table city_huge select * from City;

mysql>insert into city_huge select * from City; (4 times)

mysql>alter table city_huge add index name_btree(Name); (create index on Name column)


Execute the query file using the index:

shell>cat scan-index.sql

...

SELECT count(*) FROM city_huge force index (name_btree) where name = 'Amsterdam'

...


shell>mysqlslap -uroot -p -q ./scan-index.sql --create-schema=world -i 8

Benchmark
        Average number of seconds to run all queries: 0.110 seconds
        Minimum number of seconds to run all queries: 0.109 seconds
        Maximum number of seconds to run all queries: 0.111 seconds
        Number of clients running queries: 1

Average number of queries per client: 1000

shell>powertop
Solaris PowerTOP version 1.0       (C) 2007 Intel Corporation              
Cn                      Avg     residency       P-states (frequencies)          
C0 (cpu running)                (8.6%)          2826 Mhz        100.0%          
C1                      0.3ms   (91.4%)                                         
                                                                               
Wakeups-from-idle per second: 2633.3    interval: 8.0s                          
no ACPI power usage estimate available
Top causes for wakeups:                                                         
 3.8% (100.0)               <kernel> :  genunix`clock                           
3.8% ( 99.8)               <kernel> :  uhci`uhci_handle_root_hub_status_change 
 2.8% ( 73.3)                  sched :  <scheduled timeout expiration>          
 1.3% ( 33.3)               <kernel> :  ehci`ehci_handle_root_hub_status_change 
 0.6% ( 16.8)                   java :  <scheduled timeout expiration>          
 0.2% (  6.4)            <interrupt> :  aac#0                                   
 0.2% (  6.4)            <interrupt> :  e1000g#

Execute the query file without using the index:

shell>cat scan-noindex.sql

...

SELECT count(*) FROM city_huge ignore index (name_btree) where name = 'Amsterdam'

..

# mysqlslap -uroot -p -q ./scan-noindex.sql --create-schema=world -i 8
Benchmark                                                                                            
        Average number of seconds to run all queries: 2.380 seconds     
        Minimum number of seconds to run all queries: 2.374 seconds    
        Maximum number of seconds to run all queries: 2.391 seconds   
        Number of clients running queries: 1                                             
        Average number of queries per client: 1000                                                                         
                                                                                                                                                                                                                                            
shell>powertop                                                                               
 Solaris PowerTOP version 1.0       (C) 2007 Intel Corporation                                    
                                                                                                                        
Cn                      Avg     residency       P-states (frequencies)                             
C0 (cpu running)                (22.5%)         2826 Mhz        100.0%                          
C1                      0.4ms   (77.5%)                                                                           
Wakeups-from-idle per second: 1748.5    interval: 8.0s                                                             
no ACPI power usage estimate available                                                         
                                                                                                                         
Top causes for wakeups:                                                                                  
 5.7% (100.0)               <kernel> :  genunix`clock                                              
 5.7% ( 99.8)               <kernel> :  uhci`uhci_handle_root_hub_status_change 
 4.2% ( 73.4)                  sched :  <scheduled timeout expiration>                     
 1.9% ( 33.3)               <kernel> :  ehci`ehci_handle_root_hub_status_change 
 1.0% ( 16.8)                   java :  <scheduled timeout expiration>                       
 0.2% (  4.0)               <kernel> :  uhci`uhci_cmd_timeout_hdlr                         
 0.2% (  4.0)               <kernel> :  genunix`schedpaging                                   

For all the database application, adding index is a great method to boost read-access performance. It allows MySQL spend less time find and retrieve the records from indexing instead of scanning the whole table. The time saving of table scan can boost the power usage of the database system. In the above test, we can see that using index can significantly improve the response time of queries and save power by increasing CPUs' spending in lower power states(C1) for arround 12%.

Test 2: Query Performance

Execute query file running join with two tables

#cat join.sql

select Country.Name from Country join city_huge on Country.Code=city_huge.CountryCode where

city_huge.Population> 8000000;

select Country.Name from Country join city_huge on Country.Code=city_huge.CountryCode where

city_huge.Population> 9000000;


#mysqlslap -uroot -p -q ./join.sql --create-schema=world -i 4

Benchmark
        Average number of seconds to run all queries: 0.003 seconds
        Minimum number of seconds to run all queries: 0.003 seconds
        Maximum number of seconds to run all queries: 0.005 seconds
        Number of clients running queries: 1
        Average number of queries per client: 2

shell>powertop
Cn                      Avg     residency       P-states (frequencies)          
C0 (cpu running)                (6.3%)          2826 Mhz        100.0%          
C1                      1.3ms   (93.7%)                                         
                                                                                                              
Wakeups-from-idle per second: 726.3     interval: 8.0s                          
no ACPI power usage estimate available
Top causes for wakeups:                                                         
13.8% (100.0)               <kernel> :  genunix`clock                           
13.7% ( 99.8)               <kernel> :  uhci`uhci_handle_root_hub_status_change 
10.1% ( 73.4)                  sched :  <scheduled timeout expiration>          
 4.6% ( 33.3)               <kernel> :  ehci`ehci_handle_root_hub_status_change 
 2.3% ( 16.8)                   java :  <scheduled timeout expiration>          
 0.9% (  6.9)            <interrupt> :  e1000g#0                                
 0.5% (  4.0)               <kernel> :  uhci`uhci_cmd_timeout_hdlr

Execute query file replacing the join quereis with subqueries

#cat subquery.sql

SELECT distinct Country.name FROM Country WHERE code IN (SELECT CountryCode FROM city_huge WHERE population > 8000000);

SELECT distinct Country.name FROM Country WHERE code IN (SELECT CountryCode FROM city_huge WHERE population > 9000000);


#mysqlslap -uroot -p -q ./subquery.sql --create-schema=world -i 4 
Benchmark
        Average number of seconds to run all queries: 10.568 seconds
        Minimum number of seconds to run all queries: 10.523 seconds
        Maximum number of seconds to run all queries: 10.600 seconds
        Number of clients running queries: 1
        Average number of queries per client: 2

#powertop
Solaris PowerTOP version 1.0       (C) 2007 Intel Corporation        
                                                                                                        
Cn                      Avg     residency       P-states (frequencies)          
C0 (cpu running)                (19.0%)         2826 Mhz        100.0%     
C1                      1.1ms   (81.0%)                                         
                                                                                            
Wakeups-from-idle per second: 731.0     interval: 8.0s       
no ACPI power usage estimate available                            
                                                                                                       
Top causes for wakeups:                                                                
13.7% (100.0)               <kernel> :  genunix`clock                           
13.6% ( 99.8)               <kernel> :  uhci`uhci_handle_root_hub_status_change 
10.0% ( 73.4)                  sched :  <scheduled timeout expiration>          
 4.6% ( 33.3)               <kernel> :  ehci`ehci_handle_root_hub_status_change 
 2.3% ( 16.7)                   java :  <scheduled timeout expiration>          
 0.5% (  4.0)               <kernel> :  uhci`uhci_cmd_timeout_hdlr              
 0.5% (  4.0)               <kernel> :  genunix`schedpaging
The above test shows the MySQL optimizer's specific problem on converting IN subquery from non-correlated subquery to correlated subquery, which searches for totally N * M rows of the inner and outer queries. By translating the subquery to the Join query, it saves time by retrieving only N+M rows of data in the inner and outer queries from the database. The improved response time of the Join queries save the power of the database system by incresing CPUs' spending in lower power states(C1) for arround 16%.

Test 3: MySQL Server configurations

MySQL's query cache stores the identical SELECT queries issued by client to the database server. By default, MySQL set the query_cache_size as 0.  In this test , configure MySQL server to set query_cache_size as 1M
mysql>set global  query_cache_size=1m

# mysqlslap -uroot -p -q ./subquery.sql --create-schema=world -i 4
Enter password: 
Benchmark
        Average number of seconds to run all queries: 2.630 seconds
        Minimum number of seconds to run all queries: 0.000 seconds
        Maximum number of seconds to run all queries: 10.522 seconds
        Number of clients running queries: 1
        Average number of queries per client: 2

#powertop
Solaris PowerTOP version 1.0       (C) 2007 Intel Corporation              
Cn                      Avg     residency       P-states (frequencies)          
C0 (cpu running)                (10.4%)         2826 Mhz        100.0%          
C1                      1.5ms   (89.6%)                                         
Wakeups-from-idle per second: 585.4     interval: 8.0s                          
no ACPI power usage estimate available
                                                                
Top causes for wakeups:                                                         
17.1% (100.0)               <kernel> :  genunix`clock                           
17.0% ( 99.8)               <kernel> :  uhci`uhci_handle_root_hub_status_change 
12.5% ( 73.3)                  sched :  <scheduled timeout expiration>          
 5.7% ( 33.4)               <kernel> :  ehci`ehci_handle_root_hub_status_change 
 2.9% ( 16.8)                   java :  <scheduled timeout expiration>          
 1.1% (  6.4)            <interrupt> :  e1000g#0                                
 0.7% (  4.0)               <kernel> :  uhci`uhci_cmd_timeout_hdl


After configure MySQL server to enable query cache, MySQL can save time and CPU resource on repetitivly parsing, optimization and execution of the same queries. The test above showed that the queries' response time can be redcued by nearly 4 times by retrieving query result set from the query cache directly and the power usage can be saved by incresing CPUs' spending in lower power states(C1) for arround 11%.


Conclusion

In this document, I use MySQL as an example to show the steps to measure the energy efficiency of the application software. By using the powertop tool on Solaris, we can see whether the system can take advantage of power management features when it is idle; or we can use dtrace tool on Solaris to analyze the source of the power management issues report in the powertop tool, and further improve power efficiency by developing and tuning high performance application.


References

MySQL: Planet MySQL

MySQL? European Customer Conferences 2008

There will be 3 conferences in Europe for Customers and Users:

As usual there is an early bird discount if you register until August 31st.

More details on http://www.mysql.com/news-and-events/european-conferences/2008/.

I will attend the Central Europe Conference, barring accidents. Here’s a map from Google?:


View Larger Map

To be honest, I got some difficulties some weeks ago when I registered because the registration page wasn’t entirely translated in English and, despite I would learn it, I don’t understand German :(

Same problem when I tried to read the Agenda in more details… (Why don’t you translate it in English too?)

Feel free to post a comment if you are from Italy, Slovenia or Croatia and you would like to attend the Central Europe Conference too. We can meet in Trieste to head together to Germany : )

See you there!

MySQL: Planet MySQL

Vista?, Virtual Box, Solaris? 10, MySQL? Cluster

..aka how to spend a magnificent Sunday-Monday in SUN’s company :)

If your applications run on Windows? and you need a MySQL? Cluster for testing purposes you can use Virtual Box to install SUN Solaris?  10 and set up the Database Cluster creating 4 Solaris? Zones (4 IP are required for a minimal Cluster setup).

It’s an interesting scenario.

You will need:

You can create the Solaris? Zones manually or you can create them at once using a script file.

Note that you don’t need to install 4 Operating Systems. You’ll install 1 Solaris? 10 and then 4 Zones, each with a different IP: one for the MGT Node, one for the SQL Node, one for the Data Node 1 and one for the Data Node 2.

To get started with Solaris? Zones, I followed a very useful article by a Sun blogger - “Setting up MySQL Cluster using Solaris Zone”.

Unfortunately, after setting up Solaris? 10 and the Database Cluster I was not able to connect through SSH to the SQL Node from my Vista? machine, thus my Cluster was up, running but isolated and unreachable as well :(

After a long pause I got the idea for the solution of the issue (it’s typical): I forgot the configuration of port forwarding with NAT in the Solaris? 10 Guest System.

…The disadvantage of NAT mode is that, much like a private network behind a router, the virtual machine is invisible and unreachable from the outside internet; you cannot run a server this way unless you set up port forwarding..

ummh.. always RTFM (and check it again, and again…)!

I set up the port forwarding with the commands:

C:\Program Files\Sun\xVM VirtualBox>VBoxManage.exe setextradata “Solaris-10″ “VB
oxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/Protocol” TCP
VirtualBox Command Line Management Interface Version 1.6.2
(C) 2005-2008 Sun Microsystems, Inc.
All rights reserved.

C:\Program Files\Sun\xVM VirtualBox>VBoxManage.exe setextradata “Solaris-10″ “VB
oxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/GuestPort” 22
VirtualBox Command Line Management Interface Version 1.6.2
(C) 2005-2008 Sun Microsystems, Inc.
All rights reserved.

C:\Program Files\Sun\xVM VirtualBox>VBoxManage.exe setextradata “Solaris-10″ “VB
oxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/HostPort” 2222
VirtualBox Command Line Management Interface Version 1.6.2
(C) 2005-2008 Sun Microsystems, Inc.
All rights reserved.

However since making this change the Solaris? Guest did not start anymore and I got the following error:

Configuration error: Failed to get the “MAC” value.
VBox status code: -2103 (VERR_CFGM_VALUE_NOT_FOUND).

I found some posts - like this  - but they didn’t help me.

Conclusion: it’s 4:20 am and I am still looking for a solution to get my Cluster reachable, wondering why I want to use Solaris? before attending an intensive SUN Training ;-)

I really want to figure out why if I set up the port forwarding for SSH Connections the Guest does not start.. but.. I’d go to bed ;-)

I’ll try to comment in the next days after getting help from VB guru :)

MySQL: Planet MySQL

Andrii Nikiti?s son Needs Our Help

(Reposted from Zack Urlocker’s blog)

Andrii Nikitin, one of the MySQL support engineers located in Ukraine, has asked for help from MySQLers and so I’m sharing this information to the community at large. Andrii’s son Ivan, who is 2 1/2, is in need of a bone marrow transplant operation. This will require going to a clinic in Europe that will not be covered by regular insurance. So Andrii has aksed to see if we could help raise funds. The cost is expected to be 150,000 - 250,000 EUROS. A huge amount for an engineer from Ukraine to cover. But a small amount by many people could make a big difference.

Many MySQLers have kicked in to help out, but more is needed. Ivan’s health has taken a turn for the worse recently and the issue is now quite pressing. Even a small donation could mean the difference between life and death for Ivan.

I hope some of you who use MySQL or have young kids will join me in making a donation today. You can do so by using Paypal, by sending a check to MySQL, or via wire transfer.

Paypal:
http://tinyurl.com/6rxjsz
Or
by check payable to:

MySQL, Inc.
Mail to: MySQL, Inc.
Attn: Linda Dong
20450 Stevens Creek Blvd #350
Cupertino, CA 95014

or
US wire transfer:

MySQL Inc: 7396643001
SWIFT: NDEAUS3N

or
International wire transfer in any currency:
Bank: Nordea Bank
Bank address: Stockholm, Sweden
Bank account: 3259 17 03868
IBAN: SE27 3000 0000 0325 9170 3868
SWIFT: NDEASESS

Thanks to those who have donated already. A child’s life is precious and I hope we can give Ivan a chance.

MySQL: Planet MySQL

First days as a Sun Microsystems employee

As of July 1 all Canadian MySQLers are employees of Sun Microsystems. I was expecting that we would be the rear guard, I was going to greet the great day singing “a broken man on an Edmonton pier / the last of Monty’s engineers” (with thanks to the writer of “Barrett’s privateers”), or perhaps “And I, the last, go forth companionless … Among new men, strange faces, other minds.” (with thanks to Mr Tennyson). But as things turned out, the Ukrainians and the French were delayed. So Trudy Pelzer and I were the last Canadian engineers to come on board, but not quite last of all.

And it’s been wonderful. A free lunch, a tour of Sun’s Edmonton office, a SafeWord security doohickey, a carrying case, and an alternative address = peter.gulutzan@sun.com. More seriously, I’ve gained several thousand new co-workers while keeping the old ones (almost none of the MySQL engineers didn’t join Sun, and I know of none that intend to leave now that they’ve joined Sun). This is one of those friendly acquisitions where everything has just worked, and my hat is off to the integration teams who’ve kept us all in a high-morale gung-ho cloud. The new company is an even better friend of open source than the old, the number of intelligent people I’ve encountered is even greater, and the European level of politeness — always one of MySQL AB’s best points — hasn’t been polluted at all, touch wood. I highly recommend a Sun job for connoisseurs of tasteful engineering positions.

MySQL: Planet MySQL

I'm Joining Craigslist in July

I wasn't really looking for a new job a few months ago when I received an email from Eric Scheide (see Team Bios), the CTO at craigslist. He mentioned that they were looking for someone with MySQL experience and asked if I knew anyone. This sort of thing happens all the time.

craigslist

But this time it was different. Over the course of about three seconds, something clicked in my little brain and I realized that craigslist is a pretty unique combination of things: a small company with a solid financial base, a great service that I use myself, a focused groups of people who really care about doing things well, and an open-source friendly environment.

I replied that I might be interested myself and things kind of took on a life of their own from there. In the weeks that followed, I got the chance to meet much of the team (including CEO Jim Buckmaster and Craig himself). Each time I came away liking more and more about the team. I've also been impressed at how well the company takes care of its people and how thoughtful they are about making important decisions.

So after taking a few weeks off for some planned travel and unplanned relaxation, I'll start assimilating myself into the craigslist engineering culture and lending a hand wherever I can. Yes, some of that will entail going back to my MySQL roots.

The site is growing like a weed (still!), the people are great, and the focus is on providing a great service that anyone can use. At the same time, there are a lot of technical challenges (they get a ton of page views) and great opportunities to grow the site and give back to both the open source community and all the communities around the world that craigslist serves--a list that's growing all the time.

Not that I really care much what other people think, but the reactions I've had so far when telling people have been universally positive. Very positive. That tells me I'm on the right track.

The only real downside is that crigslist is in San Francisco and I'm in San Jose. So if you have thoughts on getting to the vicinity of 9th and Judah using public transit, let me know. I won't be commuting up every day, but I suspect I'll be trying a few options before settling on what works best.

Previously: Leaving Yahoo!, and The Aftermath.

Related

(comments)

MySQL: Planet MySQL

TRUNCATE now and unto ages of ages

MySQL’s TRUNCATE statement will evolve over the next few years, as we fix bugs or strange behaviours, and as we try to make storage engines more consistent with each other. Equally interesting are the aspects that will remain the same, as we confirm that our behaviour conforms to standard requirements.

Define TRUNCATE

TRUNCATE T has two definitions:
(1) it means “delete all rows in T”, as the MySQL Reference Manual says in section “TRUNCATE Syntax”. (2) it means “drop T, then create it again”, as the Reference Manual says later on. The better definition is (2). I have proofs.

Proof #1: The required privilege for TRUNCATE is DROP.

Proof #2: TRUNCATE does not cause activation of DELETE triggers. For example:
CREATE TABLE t (s1 INT);
CREATE TRIGGER t_ad AFTER DELETE ON t
FOR EACH ROW SET @a = @a + 1;
SET @a = 0;
INSERT INTO t VALUES (1),(2),(3);
TRUNCATE TABLE t;
Result: @a = 0, which shows that trigger t_ad was never activated.

Proof #3: TRUNCATE ignores locks, DELETE doesn’t. For example:
On Connection#1, say:
set @@autocommit=0;
create table t1 (s1 int, s2 blob) engine=innodb;
insert into t1 values (1,’a'),(2,’b'),(3,’c');
commit;
truncate table t1;
On Connection #2, say:
set @@autocommit=0;
update t1 set s2 = ”;
Result: the truncation succeeds, it is not blocked by the INSERT statement. If Connection#1 had said DELETE instead of TRUNCATE, there would be a lock. This is actually a danger point with TRUNCATE, by the way: you cannot expect locking to happen the same way that it would happen for DELETE statements.

Proof #4: TRUNCATE is much faster than DELETE if there are many rows in the table. For example:
CREATE TABLE t (s1 INT);
/* insert many rows */
DELETE FROM t;
/* insert many rows */
TRUNCATE TABLE t;
I timed the DELETE, it took several seconds. I timed the TRUNCATE, it took less than one second. This speed difference is what I would expect of a statement that affects only metadata, not rows.

There is no interesting side effect that’s due to TRUNCATE’s “drop and recreate” method, namely freeing of space in an underlying file. For example:
USE test
CREATE TABLE t (s1 INT);
/* insert many rows */
/* Look at size of underlying file in datadir/test */
DELETE FROM t;
DROP TABLE t;
CREATE TABLE t (1 INT);
/* insert many rows */
/* Look at size of underlying file in datadir/test */
TRUNCATE TABLE t;
Result: DELETE and TRUNCATE have the same effect on the underlying file. If the table was made with MyISAM, the file size becomes 0, either way. If the table was made with InnoDB and Per-Table Tablespaces, the file size is unchanged, either way. But in theory this could change, giving another “efficiency” reason to use TRUNCATE.

Against the advantages, one has to lay the things that are a little peculiar. They’re documented, but frankly the MySQL Reference Manual is getting a bit clogged with all the documentations of unintuitive non-obviosities.

The SQL:2008 Standard

In December 2007 the SQL standard committee published a draft document for the next standard, which I’m tentatively calling SQL:2008 in case it comes out in 2008. The draft document has new sections which describe non-core (that is, optional) features F200 and F202, which amount to support of the TRUNCATE statement.

TRUNCATE TABLE base_table_name
[ CONTINUE IDENTITY | RESTART IDENTITY ];

Differences between SQL:2008 and current MySQL implementation:

1. SQL:2008 allows
[ CONTINUE IDENTITY | RESTART IDENTITY ]
The default is CONTINUE IDENTITY.
MySQL has no such clause, and it nowadays (after Bugs 1514, 5033, 11946, 14945, 15754) handles auto_increment by sayng that TRUNCATE TABLE causes “restart”. It’s still not working correctly with partitions, see Bug#35111 ‘Truncate a MyISAM partitioned table does not reset the auto_increment value”.

2. SQL:2008 says TRUNCATE TABLE’s function is to “delete all rows” and TRUNCATE TABLE is a data-change statement, not a schema-change statement. MySQL says TRUNCATE TABLE is “DDL” because there’s an implied DROP TABLE, which means we treat it as a schema change.

There’s actually no problem about the main issue: The standard says “… all rows are deleted from T” but it doesn’t refer to sections that would cause execution of triggers, which means the effect is not what you’d expect with DELETE, but what you’d expect from drop + create. However, the fact that MySQL thinks TRUNCATE is DDL (vaguely like the standard’s “schema change” classification), and the tendency to autocommit, are certainly differences.

3. SQL:2008 says TRUNCATE TABLE is illegal on a referenced table of a foreign key. MySQL allows it for InnoDB, and WL#148 (the worklog task for foreign keys in MySQL version 6.1) disallows it for all tables where foreign keys are involved.

So right now we’re allowing something that we shouldn’t, and in future we won’t allow something that we should.

4. SQL:2008 says TRUNCATE TABLE should return “no data” if there were no rows in the table. MYSQL returns “okay” just as it does for DELETE. This is WL#4194 “Standard handling of ‘no data’”.

This relates as well to MySQL’s display of the number of affected rows.

5. SQL:2008 says TRUNCATE TABLE table_name. MySQL says TRUNCATE [TABLE] table_name.

6. SQL:2008 says TRUNCATE is a reserved word. MySQL says it’s not.

By the way, you’ll generally find that words that begin MySQL-server statements are reserved words, but besides TRUNCATE there are other exceptions: BACKUP BEGIN CACHE CHECKSUM COMMIT DEALLOCATE DO EXECUTE FLUSH HANDLER HELP INSTALL PREPARE REPAIR RESET RESTORE SAVEPOINT SLAVE START STOP UNINSTALL XA. I’ve seen a suggestion that TRUNCATE should be special, see Bug#12074 “TRUNCATE keyword not hilighted like SELECT, INSERT, etc.”.

7. SQL:2008 says the required privilege is “be the schema owner”. MySQL says the required privilege is DROP.

8. SQL:2008 says there is no trigger action. MySQL usually has no trigger action, but there’s a bug, Bug##27935 “TRUNCATE fires DELETE trigger if InnoDB table is referenced by a foreign key”.

All your standards are belong to us

I’ve just noticed that my future boss, Sun Microsystems CEO Jonathan Schwartz, has started off one of his blog posts by saying
“De facto standards are the only ones that matter.”
This is my opportunity to say “Brilliantly put, boss.” I’ll only add that the de facto standard (what all the SQL vendors do) and de jure standard (what SQL:2008 says) tend to converge.

I can see that because, in this wonderful age when every company puts its documentation online, I can act like a grocery-store price checker and wander among the competitor’s shelves. I can find TRUNCATE’s description in the Oracle 11g manual, the SQL Server 2008 manual, the Informix manual, the PostgreSQL 8.3 manual, and the DB2 Version 9.1 for z/OS manual. Of course, I find that there are details where the other vendors don’t follow the standard, just as MySQL doesn’t, see above. But in general, and in many of the details, I find that de facto = de jure.

I’ve seen this so often, and probably will devote a whole column to the subject someady … Convergence happens. Sometimes convergence happens because individual vendors commit themselves to adhering to an established standard feature. Sometimes the standard committee tries to reflect what goes on already, which is what’s happening here. In any case, the reason is rarely “academic”. TRUNCATE is hardly the sort of feature that one would add because of orthogonality or relational theory.

Even when the new standard becomes official, I doubt that the material on TRUNCATE will muchly affect MySQL unless users make requests. If any of the above are really burning issues for your installation, you can fill in a feature request form on bugs.mysql.com. Some of the TRUNCATE-related feature requests that I’ve seen already are
Bug#5507 TRUNCATE should work with views
Bug#17092 allowing TRUNCATE TABLE while LOCKing done
Bug#19405 No way to truncate a partition.
We have in-development plans for other tasks that mention TRUNCATE adjustments:
WL#3241 Truncate table in Maria
WL#2418 DDL Triggers (including perhaps a TRUNCATE trigger)
WL#3821 TRUNCATE for VIEWs (including perhaps a TRUNCATE privilege)
WL#3129 Consistent clauses in CREATE and DROP (including perhaps TRUNCATE IF EXISTS table1, table2 …)

MySQL: Planet MySQL

Back home after the meeting in Rome with Marten Mickos

Great conference yesterday at “La Sapienza” University in Rome with Marten Mickos and the Italian Team of Sun | MySQL.

I’ve just returned in Triest, my hometown, after 8 hours :( train ride.

Slides should be available the next week, here.

Marten, thank you for coming, I hope to see you in Italy once again!

MySQL: Planet MySQL

Personal Opinion: Half-baked = false analogy

Occasionally I hear the worry that MySQL might plan a feature that’s “half baked”. The term’s users include some of the world’s top MySQL experts so I’ll avoid a technical argument.

“In future we should not release a version with half-baked features and call it enterprise-ready.”
– Konstantin Osipov
http://www.xaprb.com/blog/2007/08/12/what-would-make-me-buy-mysql-enterprise/

“will Falcon be pushed hard as Innodb replacement even if it is half baked?”
– Peter Zaitsev
http://www.mysqlperformanceblog.com/page/2/?s=falcon&search_posts=true

As for me, though, I believe in features that others call “half baked”. I just don’t use the term much myself because it seems to convey a wistful regret. While “we’re halfway there” would generally be taken as a report of good progress.

The analogy seems to be with a cake: if you take a cake out of the oven and let it cool, and then find it’s not done, and then put it back in the oven to bake again, it won’t be good. But I prefer to analogize with a journey: if you take one step then you have completed a useful fraction of it. Of course, that assumes that the step is in the right direction, which means you must decide in advance what your destination is, and it is that — the failure to decide the destination — that causes embarrassing steps in seemingly random directions. I call those “drunkard’s walk” features, rather than “half baked” features. A bad sign is if someone at MySQL says “let’s do this because it’s low hanging fruit”, but I’m happy to say that I’m hearing those dread words less frequently nowadays.

So when I hear about a “half baked MySQL feature”, I’d want proof, not that this is only part of a job, but that it’s not getting us toward the job we really want.

MySQL: Planet MySQL

Personal Opinion: Timeouts

I can find most of MySQL’s “timeout” variables with a SHOW statement:

mysql> show variables like '%timeout%';+----------------------------+-------+

| Variable_name              | Value |

+----------------------------+-------+

| connect_timeout            | 10    |

| delayed_insert_timeout     | 300   |

| falcon_lock_wait_timeout   | 50    |

| innodb_lock_wait_timeout   | 50    |

| ...................................|

| interactive_timeout        | 28800 |

| net_read_timeout           | 30    |

| net_write_timeout          | 60    |

| slave_net_timeout          | 3600  |

| table_lock_wait_timeout    | 50    |

| wait_timeout               | 28800 |

+----------------------------+-------+

11 rows in set (0.01 sec)

The straightforward list conceals several anomalies and buried controversies.

* Should we allow non-integer values like 0.55 seconds, or perhaps make all values milliseconds rather than seconds, in order to reflect the importance of blocking for as little time as possible?

* Should we allow the value to be zero? If so, does zero mean “there is no timeout” (i.e. zero = infinity), or does 0 mean “timeout should go on for zero seconds”?

* Should the default be 50 for more cases, or tailored to something that looks less arbitrary?

* Should it be possible to set timeouts per session, or should it always be done with GLOBAL?

It’s a set of questions that we had to revisit in January when falcon_lock_wait_timeout appeared, and we’ll see it again for an upcoming timeout variable in BACKUP. And my opinion is: I wish we had a magic formula for every question that ever arose at MySQL.

MySQL: Planet MySQL

Feature: Run Your Personal Wikipedia from a USB Stick

You don't have to lease server space or keep your home computer always on to access a personal web server

MediaWiki: del.icio.us/tag/mediawiki

Getting The Most Out Of The MySQL Conference

As half of the world population already knows, the MySQL conference is coming in less than 3 weeks. Since this event only happens once a year, lasts only 4 days, and costs more than a Russian mail-order bride, I'd really like to get the most out of it. Considering that the schedule is completely packed, with 8 (!!) events going on in parallel, I imagine things can get a little frantic. Additionally, I've never been to a conference of such size before and I'm not sure what to expect.

So… I'm contemplating:

  • printing out the event schedule and drawing a zig-zagging "map" of exactly where I'll be jumping to next, once the previous presentation ends. I'm actually wondering if I'll need to figure out where all the events are located exactly in advance. How big is that place? Did Google invent in-building walking maps yet? Do people normally jump from one presentation to another parallel one or is that unheard of?
  • bringing a laptop to take notes. I find it that my brain tends to retain mostly the general ideas for a good period of time. Code details and specifics tend to flush a lot sooner. Keeping notes (and publishing them online) is the best way to retain all this tasty information. Learn it and starting doing it, don't be lazy. For my note taking application, I actually prefer Microsoft (:gasp:) OneNote. It keeps things organized and has a coupe of neat tricks up its sleeve, like built-in OCR, Win-S shortcut for a quick area-defined screenshot, integration with Outlook, audio note-taking. Aha!..
  • recording audio at every presentation, is that allowed?
  • getting plenty of sleep the night before each conference day, as the amount of information is going to be simply crushing. I guess I'm going to have to postpone my 3am sessions until Friday or so.
  • bribing an organ thief to steal Peter Zaitsev's brain and replace it with a statistical computer chip capable of running 17 billion MySQL benchmarks a second. Nobody is going to notice the difference anyway.

Do you have any tips? How do YOU handle conferences? Please share in the comments.

MySQL: Planet MySQL

What I love about the MySQL Community

The people, plain and simple.

I encourage you, if you are not part of the local MySQL community, then why not? Find a local MySQL Meetup group, attend the MySQL Conference, get online to IRC @ freenode#mysql, subscribe to some lists or read the forums. I didn’t meet these people being an observer, but part of the community.

As I wrote recently with some vacation time and the need to pick the brain of some experts I organized a quick trip to Germany to specifically visit Jan Kneschke and Paul McCullagh to improve my MySQL skills in certain areas on my own time (something I’d find harder to want to do at Sun when they owned all IP & copyright even during my personal time, so what incentive is there?)

As with the best laid plans of mice and men, the mention a few weeks earlier of skiing (my number one passion) and the chance to do this with Michael Zinner in Austria was too much a golden opportunity to miss, so this was added to my trip.

But it didn’t stop there. These were my plans yet news of my visit brought out others to want to see me during my visit. With Jan, additional events organized included my chance to see Kai Voigt, and then Jens Bollmann and Ulf Wendel. With Paul, lunch was organized with Lenz Grimmer. This is one beauty of a distributed company such as MySQL, people in lots of different places and something I am most glad that will not be lost just simply because I’m no longer an employee.

In the past year I’ve spent quality time with other MySQL Community people and also not just from MySQL. This past twelve months has included time with Sheeri Kritzer Cabral in Boston many times (including her wedding), Jay Pipes in Columbus (including also going there for Thanksgiving), Chad Miller in Orlando (even just to visit one day to see a shuttle launch), Gary Whizin in Santa Cruz, Baron Schwartz in Charlottesville and of course Farhan Mashraqi in New York. It’s great to get to know people better personally, to meet spouses and partners and to just consider life in general. People and their own goals and dreams are far more significant then just our own careers.

During my time at MySQL I’ve also made friends with several clients and have spent time with them in varying locations in the US including New York, Jacksonsville, Chicago and San Francisco.

In the past week following a request for any Linked In recommendations of my work, I’ve exceeded my goals of 25, and I’m now at 31 of 166 contacts, many from members of the MySQL community and clients.

I’m sure 2008 will be no different. Any offers for me to visit are always most welcome and to live by my motto “Life is Short. Live Life. Life is an Adventure”.

MySQL: Planet MySQL

Telecommuting in 2008

Considering the few posts made recently, regarding Telecommuting, I thought I’d put in my $0.02, adjusted for inflation.
See Cal’s Post and this awesome job opportunity.
I want to discuss the various tools and options available to the telecommuter in 2008.

Disclaimer: I don’t want to make this another post on the benefits of Telecommuting, as I strongly believe there is no silver bullet for the problems an individual or a company faces. There are many drawbacks to telecommuting, and the positives don’t always neutralize the drawbacks.

Background: I’ve been telecommuting since 2003, that’s when I decided to quit working for the ‘man’. I chose a laptop to give myself mobility. This mobility also came in the form of being single, unattached, and my willingness to drive. I drove from New Jersey to San Antonio, to Corpus Christi, TX, back to San Antonio, to San Francicso, to Los Angeles, to Boca Raton, FL, to Alexandria, VA, Birmingham, AL, to Herndon, VA to San Antonio TX.

All of this traveling was done between January 2003 and October 2005. So, for almost two years I lived in hotels, and everything I owned, fit in my Nissan Sentra. This was telecommuting 1.0, I call it telecommuting because quite often, I could work from my hotel room, they just wanted me close by, in case of meetings.

As much as I loved driving and finding new places, hotel life does get a bit tiring. So, in 2005 I decided to settle down in San Antonio. I’m sure a lot of you will wonder why I chose Texas of all places, to which I can only say ‘try it’.

Here are some tools I use to help keep my sanity, and my contract.

  1. Laptop: I use a macbook pro as my one and only machine, the laptop is my primary means of income. My laptop is almost 1.5 yrs old, and it’s still more than what I need. My only gripes are:

    1. Glossy Screens: I chose the glossy screen from Apple, it’s a great looking screen, but it’s not the best in all situations (outdoors).
    2. Resolution: 1680×1050 is nice, but I think I’d prefer 1920×1200, so I want to buy the new MacBook Pro’s, but I don’t know what to do with this laptop first. I can’t logically, or fiscally justify two laptops.

    Since OS X has unix at the core, I have a local development environment for all my projects. I’ve tried out things like MAMP, MAMP Pro, but I keep going back to my old Linux roots. I have the default Apache, and a customized PHP installation, along with MySQL.

    I use subversion for my source control, though I am thinking of switching to GIT. But, whether you use GIT or SVN, it doesn’t matter as long as you use something it’s good. Basically, the use of SVN, Apache, PHP and MySQL allows me to have a full local dev environment. I can write new code, make bug fixes, document, whatever, all from the comfort of my laptop. And I never have to say
    “oh, I can’t get to that file it’s at home, or at the office”.

    The laptop alone gives me the freedom to work from home, or any other place with internet. As much as I love my laptop, I stronly urge you to learn to live without it as well. I’m a strong believer in the no laptop per meeting initiative (there should be one if there isn’t already).

    You don’t have to use a mac, or install Linux. I know people who are very happy with Windows as their primary OS/Development environment, this is development so performance is not your primary concern here.

  2. Mobile Data Cards (EVDO/HSDPA): These things are awesome, and worth it if you telecommute. Now, I’m sure you’re going to ask.. but if I work from home, isn’t my cable/dsl faster and more reliable than Sprint/Verizon/ATT? Why yes it is, but telecommuting doesn’t necessarily mean you’ll be in the comfort of your home all the time, nor does it mean that you’ll want to be. I spend more time at Starbucks, Coffee houses, and other places where the weather is nice, and the view is even nicer. None of these places have reliable Wifi, or wifi at all. But, my Sprint EVDO card works. Plus, for what I do, it’s awesome. I don’t upload massive files, I don’t download massive files, I upload incremental changes, ssh into remote machines, and I read the news with my RSS reader. I really don’t need a 10mbps internet connection. I need a stable connection, that lets me work anywhere. These card work all the time, and work every where I have been.
  3. Good headphones: I use the V-Moda head phones, they are good enough for my ears. They sound great, and fit perfectly. They’re so comfortable that sometimes I get up with the phones still in my ears, only to be yanked down by the laptop. I’m interested in bluetooth headphones, if someone has any opinions, please share them. Anyway, the reason for headphones is because they help you concentrate. They drown out the ambient noise of the environment you’re in, OR you can turn up your music as loud as you like then. I can’t work in silence, nor can I work when other people are talking. I love listening to music when I work, it helps me concentrate.
  4. An extra power cord: I have two power cords for my laptop. One is stationary and never leaves the house. The other is always in my laptop bag. This is just a peace of mind/convenience thing. I used to land in a different city, or get to the airport and find out I left my powercord at home, or at the client’s site, but no more. Part of it is due to better discipline, part of it is the intangible comfort I have when I travel, that there is always a cord at home.
  5. A good laptop bag: Like a man or a woman, is hard to find. Everyone has their own preferences, I prefer to use bags that can carry things besides my laptop. I like to take pictures, and since I normally find myself in places I find interesting, the ability to have my camera with me is a huge plus. But, sometimes I want to travel light, I want the ability to take the camera out, as well as the compartments for the laptop and lenses. The bag should transform from a large laptop bearing bag, to a smaller version of it self.
    I prefer the use of back packs, but I like the style of messenger/brief case bags. So, I’m still trying to find the perfect fit, but for now I use a Targus back pack. I’ve tried Brenthaven, Tumi, Timbuk2, and InCase bags, the Targus has been with me the longest, due to it’s simplicty, and my ability to pack my camera bag in it, when I need to.
  6. A portable hard drive: I have a Maxtor 160GB USB only drive, I bought it on sale for $80. I now have a constant backup of everything on my laptop. If I’m traveling, I just start the rsync before I go to bed. In case the events of the day cause me to lose my hard drive, drop my laptop, or I become a victim of the burning battery, my work is backed up. I am looking at Truecrypt, to see how I can make my backups a little bit more secure. If I lose/forget my drive, then my work becomes the property of Joe H4×0r, which is bad. I’d love to hear your opinions on this. I’m trying to find a balance between encrypting everything, and only encrypting the things I need encrypted.

I believe these 6 items, will help you, because they have helped me. I don’t want to talk too much in detail about how you should
live your day, or how you prioritize your tasks. There are plenty of sites out there to help you with GTD, Focusing, Time Tracking/management. Those same rules still apply, whether you work from home, or the office, you will still need to prioritize. You will still need to keep track of your hours, and you’ll still need to make time for the wife, kids, dogs, and most importantly, yourself.

MySQL: Planet MySQL