In MySQL, COUNT() and MAX() are functions that can be used with grouped data. To understand how to use them, consider the following examples.
Suppose you are managing the database for a store with various items for sale. You have a table called “items“, with columns for a unique identifier, price, manufacturer ID, and other information. You also have a table called “manufacturers” with info on each of the manufacturers sold in the store; columns include a unique identifier and a name. The manufacturer ID in the items table corresponds with the matching row in the manufacturers table.
Example 1: Using COUNT
If you want to find out how many items of each manufacturer are in the database, you could use a query like this:
The results of this query might be:
The COUNT() function is an aggregate function that works on a group of rows defined by a GROUP BY clause. When you use the GROUP BY clause, MySQL groups all result rows that have the same value for the specified column (in this case, manufacturer_id). In other words, it returns a row for each distinct value in that column. Aggregate functions like COUNT() give you information about the members of the grouped rows. The asterisk in COUNT(*) means every member of the group is counted. Looking at the first result of the query, we can infer that the items table contains 12 rows whose manufacturer_id equals 1, 9 rows whose manufacturer_id equals 2, and so on.
Example 2: Using MAX
MAX() is another aggregate function used with a GROUP BY clause. It simply returns the maximum value of a specified column within a group. Suppose, for example, that instead of finding the number of items for each manufacturer, you want to find the price of the most expensive item for each manufacturer. You could use a query like this one:
The results of this query might look like this:
Example 3: Another Way to Compute the Maximum Value
There is another way to find the maximum: by ordering rows from greatest to least (according to the relevant column), and using a LIMIT clause to return only the first row. For example, to find the manufacturer that has the most items, just add an ORDER BY clause (with descending order) and a LIMIT 1 clause to Query 1:
The result would be:
Example 4: Including Multiple Rows in the Output
Note, however, that only one result will be shown by this query, even if two or more manufacturers are tied for the most items. To get around this problem, add a HAVING clause to the first query above. This allows you to filter the result rows by allowing only those rows whose count column matches the maximum. To find that maximum, use a subquery similar to Query 3, where you used an ORDER BY and a LIMIT clause. The whole query would look like this:
The result of this query would be the same as the previous result, since in this example, only one manufacturer has the highest number of items, 23. But if there were another manufacturer, say “Ties Inc”, with 23 items in the database, then the result would be:
In this query, the subquery that finds the maximum item count is not dependent on the containing query. So the MySQL server computes it once, ahead of the containing query; the resulting value of 23 is then used in the HAVING clause as a filter against the aggregate rows produced by the main query.
Example 5: Using MAX and COUNT Together
Now consider a slightly more complicated example. You want to find out, for each manufacturer, what their highest item price is, and how many of their items have that highest price. You will again be drawing results from the items table, grouped by manufacturer, but you’ll also use a subquery to compute, for each manufacturer, what their highest price is. The result of this subquery will be used in a WHERE clause to limit the items counted per manufacturer to those that have the highest price.
The query looks like this:
The result would be:
Note that unlike the previous subquery, this one is dependent on the containing query, because it uses i.manufacturer_id. (It is therefore known as a correlated subquery.) The MySQL server cannot compute it ahead of time as in the previous example; it will be computed for every row generated by the containing query. This may pose performance issues in larger databases.