The most obvious cost related to packaging is direct material, but the largest cost associated with packaging size is transportation. This is especially true for products shipped in large volumes and subject to dimensional-weight pricing.
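To make the transportation stake concrete, here is a minimal sketch of how dimensional (volumetric) weight is typically billed. The divisor of 139 cubic inches per pound is a common domestic parcel value, but the exact divisor and rounding rules are assumptions to confirm with your carrier.

```python
import math

def dimensional_weight_lb(length_in, width_in, height_in, divisor=139):
    """Dimensional weight in pounds; 139 is a common carrier divisor for
    inch/pound parcel rates (an assumption -- verify with your carrier)."""
    return math.ceil(length_in * width_in * height_in / divisor)

def billable_weight_lb(actual_lb, length_in, width_in, height_in, divisor=139):
    """Carriers typically bill the greater of actual and dimensional weight."""
    return max(math.ceil(actual_lb),
               dimensional_weight_lb(length_in, width_in, height_in, divisor))

# Shrinking a 20 x 16 x 12 in. box (28 lb dimensional) to 18 x 14 x 10 in.
# (19 lb dimensional) cuts the billable weight of a 10 lb product by 9 lb.
```

Under these assumptions, a modest reduction in cube lowers the billable weight of a light product far more than any material savings would suggest.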

There are many contributing factors to package size, such as product size and fragility, marketing demands, and supply chain hazards. Perhaps the number one reason for inappropriately large package size, though, is the type of laboratory drop test employed. Overly harsh laboratory tests (ones that don't accurately reflect the true hazards found within your supply chain) drive excessive packaging, increasing both packaging and logistics costs. If your company uses any of the well-known distribution simulation tests from the International Safe Transit Association (ISTA), the American Society for Testing and Materials (ASTM), FedEx, and others, then you may well be contending with excessive damage, excessive packaging, excessive logistics costs, or, sadly, all of these at once. If tests are overly harsh in some respects yet completely miss certain damaging inputs commonly found in distribution, the result is both high damage rates and excessive packaging. To top it all off, none of these standards helps limit excess packaging.

Many companies employ tests like ISTA 3A (Packaged Products for Parcel Delivery, 150 pounds or less), ISTA 3B (Packaged Products for LTL Shipments), ASTM D4169 (Performance Testing of Shipping Containers and Systems), ASTM D7386 (Performance Testing of Packages for Single Parcel Delivery Systems), FedEx's "Testing Packaged Products up to 150 pounds," and, most recently added, ISTA's Project 6 tests. It's interesting to note that although these tests seemingly deal with many of the same shipping environments, they differ from each other. Why would that be? What would a shipper do if its products pass one test but not another? Perhaps more important, though, is this: What are the shared shortcomings of all of these tests, and how do those shortcomings translate into extra costs for many, if not most, companies?

Testing Doesn’t Always Lead to Lower Costs

The series of tests listed in these standards is based on a great deal of research. Modern research into handling inputs relies on recordings taken with high-tech data acquisition recorders that measure shock, vibration, temperature, humidity, and even GPS coordinates. Though the technology is impressive, there's a basic flaw in most studies that attempt to capture drop-height data. Unlike vibration, temperature, and humidity, which can be measured continuously, free-fall drops happen infrequently. Also, the vast majority of studies use dummy packages, meaning the test package is a box of a certain weight and size with nothing inside to break, unlike actual products being shipped. In other words, the measurements may be highly accurate, but there is no correlation to actual, known, consistent field failures. What happens if the recorder captures a 100" drop? Such drops certainly do happen in distribution, but without damage correlation, one would never know whether the recorded drop is a truly significant issue. Conversely, just how many measurements of a supply chain would one need before confidently claiming to know the parameters of its drops? It would take hundreds of such dummy-package measurements, yet no study has been this extensive.

In contrast, some companies have come to rely on careful collection of damage data coupled with direct field observations. Using literally millions of their own products as data acquisition recorders, these companies replicate the known, consistent field damage in the lab, essentially measuring the environment by reproducing its effects, and then set these inputs as baselines for all products to meet before being shipped through similar supply chains.

Beyond the lack of damage correlation in the studies that formed the standards, what other shortcomings are there? Consider:

No two standards are the same. Is one better than another?

None tests all 26 orientations of a box (6 faces, 12 edges, and 8 corners), and yet the distribution system clearly does. Do these organizations know that only their 6 or 10 or 11 or 17 test orientations are the ones that will be dropped upon?
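The 26-orientation count follows directly from the geometry of a rectangular box; this short enumeration sketch shows one convenient way to derive it:

```python
from itertools import product

# Describe each impact direction as -1, 0, or +1 along each of the three
# box axes; excluding the no-impact case (0, 0, 0) leaves 3**3 - 1 = 26.
directions = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]

faces   = [d for d in directions if sum(map(abs, d)) == 1]  # 6 faces
edges   = [d for d in directions if sum(map(abs, d)) == 2]  # 12 edges
corners = [d for d in directions if sum(map(abs, d)) == 3]  # 8 corners
```

Any lab sequence that covers only a subset of these 26 is betting that the field will cooperate.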

Some standards demand all 6 or 10 drops from the maximum height, yet no research has ever found 6 or 10 maximum-height drops in distribution for a single package. In fact, there's only a small chance of even a single maximum-height drop in distribution.

Every standard has only one sequence of drops. In other words, whether you test a single unit or a thousand units, the order of drops is to be exactly the same each and every time. Those of us who've spent considerable time in test labs know the sequence of drops can be the difference between pass and fail. The fact is this: the sequence of drops found in distribution is random, and this randomness leads to failures. Why are there no drop tests that employ multiple drop sequences?
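One remedy a standard could adopt is sketched below: draw a fresh randomized drop order for each test unit instead of repeating a single fixed sequence. The helper and the orientation labels here are hypothetical illustrations, not part of any published standard.

```python
import random

FACES   = [f"face-{i}" for i in range(1, 7)]
EDGES   = [f"edge-{i}" for i in range(1, 13)]
CORNERS = [f"corner-{i}" for i in range(1, 9)]
ORIENTATIONS = FACES + EDGES + CORNERS  # all 26 impact orientations

def random_drop_sequence(n_drops, seed=None):
    """Return a randomized drop order; passing a seed makes a given
    unit's sequence reproducible for the lab record."""
    rng = random.Random(seed)
    return [rng.choice(ORIENTATIONS) for _ in range(n_drops)]
```

Seeding per unit preserves repeatability for audits while still exposing each unit to a different order, closer to the randomness the field actually delivers.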

Perhaps most important, in terms of cost, is that no standard provides a pathway to design-margin testing, where one could quantify the amount of excess packaging being used. In other words, all of these tests are pass/fail. Many companies want only to pass the tests and then declare they're ready for production, but wouldn't it be worthwhile to know whether a drop of an additional inch causes damage? Wouldn't it be good to know the most likely thing to fail in distribution if inputs surpass the lab test levels? Conversely, wouldn't it be important to know if the package that passed a 30" drop test also passes a 50" drop test, clearly indicating excessive packaging?
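A design-margin protocol could be as simple as stepping the drop height past the pass/fail spec until the package fails. The sketch below is a hypothetical procedure for illustration, not drawn from any of the standards named above.

```python
def drop_height_margin_in(survives, spec_in=30, step_in=2, max_in=72):
    """Step the drop height upward from the spec until failure.

    survives: callable taking a height in inches, returning True if the
    packaged product passes a drop from that height.
    Returns the highest passing height, or None if the spec itself fails.
    """
    last_pass = None
    height = spec_in
    while height <= max_in and survives(height):
        last_pass = height
        height += step_in
    return last_pass

# A package that still passes at 48" against a 30" spec carries 18" of
# margin -- a strong hint that cushioning (and cube) could be reduced.
```

The gap between the spec height and the highest passing height is a direct, testable measure of how much packaging is excess.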

None of the standards suggests appropriate levels of robustness for products. One reason for excessive packaging can certainly be poor packaging design, but overly large packages can also reflect products that aren't designed with logistics costs in mind. When product designers partner with packaging engineers who can suggest modifications in both geometry and fragility, the minimum landed cost can be achieved for packaged products.

Besides the above questions, one should also wonder whether these standards are good enough for worldwide distribution. For instance, if you ship products to India and Asia, would you expect the standard tests to suffice for parcel shipments on the other side of the planet? On the other hand, if you ship products to India with absolutely no damage, and you use the same packaging for the entire world, wouldn't that be a clear indication of excessive packaging for the US, Europe, and Japan? In other words, it's not only whether your packages pass or fail these tests; it's time to really consider what the results mean for your packaging strategy.

Kevin Howard is a consultant with and owner of Packnomics LLC. His focus is on distribution packaging design and testing. He will speak at this year’s PARCEL Forum on Tuesday, October 29 at 12:00 PM. He can be reached at

This article originally appeared in the September/October, 2019 issue of PARCEL.