Posts Tagged ‘database’

Oracle Still Top Dog in Tough Database Market

Monday, June 30th, 2008

The field of commercial relational database vendors is a lot less crowded than it used to be, and it’s no surprise, considering the players have to contend with a massive software juggernaut like Oracle. According to the latest numbers from research firm IDC, Oracle still ruled the roost in databases in 2007, capturing in excess of 44 percent of the overall market.

Not even Oracle can afford to rest on its laurels, however; not when the database market remains this competitive. In addition to pressure from the other two top proprietary vendors — IBM and Microsoft — Oracle must contend with increasing competition from open source software. For example, last week Sun Microsystems, which acquired MySQL in January, announced an aggressive new pricing structure that allows customers to install as many instances of the open source database as they want, including enterprise-class service and support, for a single, flat rate.

Included in the deal is Sun’s GlassFish Java application server, which can be used to host custom enterprise applications that store their data in the database. Pricing reportedly begins at US$65,000 per year and scales up based on the number of employees in the organization. (Sun already uses similar, headcount-based pricing for much of its software portfolio.)

If that sounds like a lot of money, consider that the latest pricing for the Oracle 11g database starts at around $47,500 per CPU, following a price hike that took effect earlier this month. By comparison, Sun is offering site-license pricing — you can install MySQL on as many CPUs as you want for the one rate.

MySQL can’t compete with Oracle on a feature-for-feature basis, especially when it comes to the advanced capabilities needed by heavy enterprise users, such as data integrity and replication. But many applications don’t need the high-end features offered by top-tier database. For example, many Web applications need nothing more than simple data storage, which MySQL offers in spades.

It can be difficult to properly analyze MySQL’s true market share, because you don’t have to be a Sun Microsystems customer to use it. MySQL is open source, so you can generally download and use the database for free (although some licensing restrictions may apply). Even if it were possible to count every single instance of MySQL that is currently in use, there’s no way of knowing how many of those users represent potential business for Sun.

As a rule, however, users who have extensive experience using open source software for prototype or “off the record” projects are good candidates to become paying customers of open source vendors in the future. What they get for their money is commercial-grade support, which can be invaluable when open source software is used to power mission-critical applications. Open source support contracts usually come at much lower price tags than equivalent offerings from proprietary software vendors, such as Oracle.

MySQL isn’t the only low-cost contender on the market, either. PostgreSQL is similarly open source, and offers a feature set that’s more comparable to Oracle, IBM DB2, or Microsoft SQL Server. Given how easy it has become to install and use a database for free, it’s entirely possible that relational databases may soon become a commodity market, especially among those mid-tier customers who don’t need the most advanced capabilities.

What Did the Database Do?

Wednesday, June 4th, 2008

Of course, this is not the whole story. Much more happens behind the scenes than simple conversational requests and acknowledgements.

The first interaction the register had with the database occurred when the request for the product name and price was processed. Let’s take a look at how that request was really handled.

If the database is an SQL database, like MySQL or PostgreSQL or many others, then the request would be transmitted in the standard Structured Query Language (SQL). The software running on the register would send a query to the database that looks similar to the following.

SELECT name, price FROM products WHERE id = 885909054336;

This query instructs the database to look in the products table for a row (also called a record) in which the id column exactly equals 885909054336.

Every database may contain multiple tables, and every table may contain multiple rows, so specifying the name of the table and the row’s unique identifier is very important to this query. To illustrate this, an example of a small products table is shown in Figure 2.

When the database has successfully found the table and the row with the specified id, it looks for the values in the name and price columns in that row. In our example, those values would be “iPod” and “200.00”, as seen in Figure 2. The execution of the previous SELECT statement, which extracts those values from the table, is shown in Figure 3.

Sponsored Links

The database then sends a message back to the register containing the product’s name and price, which the register interprets and displays on the screen for the cashier to see.

What is a Database?

Wednesday, June 4th, 2008

When you are in a big electronics store buying the latest edition of the iPod, how does that store’s inventory tracking system know you just bought an iPod and not, for example, a car stereo or a television?

Let’s walk through the process of buying an iPod and consider all the implications this has on the inventory database that sits far underneath all the shiny, new gadgets on the sales floor.

When you hand the iPod box to the cashier, a barcode scanner reads the label on the box, which has a product identification number. In barcode language, this number might be something like 885909054336. The barcode representing this number can be seen in Figure 1.

Figure 1. A sample barcode

The barcode acts as a unique identifier for the product; in this case, all iPods that are the same model as the one passing across the barcode reader have the same exact barcode.

The barcode scanner relays the number represented by the barcode to the register at the cashier’s station, which sends a request (or a query) to the store’s inventory database. This database could be in the same store as the register or somewhere across the country or even around the world, thanks to the speed and reliability of the Internet.

The register asks the database, “What are the name and price of the product that has this barcode?” To which the database responds, “That product is an iPod, and it costs $200.”

You, the customer, pay your $200 and head home with a new toy. Your work in the store is finished, but the inventory management system still needs to reconcile your purchase with the database!

When the sale is complete, the register needs to tell the database that the iPod was sold. The ensuing conversation goes something like the following.

Register: “How many products with this barcode are in our inventory?” Database: “1,472.”

Register: “Now, 1,471 products with this barcode are in our inventory.”

Database: “OK.”

What are the steps involved in Instance Recovery?

Tuesday, June 3rd, 2008

Rolling forward to recover data that has not been recorded in data files, yet has been recorded in the on-line redo log, including the contents of rollback segments. Rolling back transactions that have been explicitly rolled back or have not been committed as indicated by the rollback segments regenerated in step a. Releasing any resources (locks) held by transactions in process at the time of the failure. Resolving any pending distributed transactions undergoing a two-phase commit at the time of the instance failure.

Is VARCHAR(80) faster than VARCHAR(255)

Tuesday, June 3rd, 2008

Yes, but probably not in the way you think. AFAIK, there’s no difference (at least in recent versions of MySQL and InnoDB) between the speed at which VARCHAR columns of different lengths are retrieved from disk or memory. However, there is a big difference when either of the following scenarios occurs:

A temporary table is implicitly created to handle a GROUP BY or ORDER BY clause and a VARCHAR column is in the SELECT statement

A temporary table is created explicitly which contains a VARCHAR column
In these cases, the length of the VARCHAR columns does come into play. Why? Because temporary tables in memory are actually just tables of the MEMORY storage engine. The MEMORY storage engine, for some reason, treats all VARCHAR(X) columns as CHAR(X) columns.

This means that if you define two fields, one as VARCHAR(255) and another as VARCHAR(128), the latter will consume half as much space when allocated in a temporary table. The more records can fit into the max_heap_table_size, the fewer cases of swapping to disk tables (look for SHOW STATUS LIKE ‘Created_tmp_disk_’;) will occur, resulting in better overall performance.