# Adding a Generic p-norm in Modern C++

In this article, I will show how to use modern C++ features such as fold expressions, variadic templates and type-traits to make a safe, stable and generic $p$-norm function.

## What the hell is a p-norm?!

So, a lot of you probably won’t ever have heard of a $p$-norm before. It’s a fairly abstract concept from linear algebra and functional analysis, and what it does is takes an $N$-dimensional vector and returns some concept of its ‘distance’ from the origin.

The most common form of $p$-norm is the $2$-norm, and is also called the Euclidean norm. Remember in high school when you had to compute the distance between two points, $\vec{a}$ and $\vec{b}$? What you were doing was computing the Euclidean norm of $\vec{b}-\vec{a}$, which is often written in mathematical literature as $\|\vec{b}-\vec{a}\|$.

Mathematically, we can write the $p$-norm as follows:

$$\|\mathbf{v}\|_{p} = \left(\sum_{i=1}^{N}v_{i}^{p}\right)^{1/p} = \left(v_{1}^{p} + \dots + v_{N}^{p}\right)^{1/p}\tag{1}$$

There are two other often used $p$-norms, the $1$-norm (aka the Taxicab norm or Manhattan length) and the $\infty$-norm (aka the max-norm). If we use our definition of the $p$-norm above, we can see what the taxicab norm does:

$$\|\mathbf{v}\|_{1} = \sum_{i=1}^{N}v_{i}$$

So the taxicab norm just gives us the sum of the components of our vector. If you imagine a taxi that has to get from point a $\vec{a}$ to point $\vec{b}$ in a grid where each of the roads are at right angles to each other (like Manhattan), it would have to travel a total distance of $\|\vec{b}-\vec{a}\|_{1}$. This is demonstrated in the image below (borrowed from Wikipedia): The green line is the Euclidean distance between the two points, and all of the other lines are different ways of achieving the taxicab distance.

The max-norm is a bit harder to understand, so if mathematical analysis is not your cup of tea, feel free to skip straight to the C++!

For those of you still with me, now to understand the max-norm. It doesn’t make sense for us to just substitute $p = \infty$ into $(1)$ – powers of infinity don’t make much sense! Instead, we have to write it as a limit:

$$\|\mathbf{v}\|_{\infty} = \lim_{p\to\infty}\left(\sum_{i=1}^{N}v_{i}^{p}\right)^{1/p}$$

That’s great, but it’s still not much use. To get anything useful out of this, we need to use the fact that we can always write $a$ as $(a \times b)/b = a/b \times b$. Now we can therefore write our equation above as:

$$\|\mathbf{v}\|_{\infty} = \lim_{p\to\infty}\left(\sum_{i=1}^{N}\left(\frac{v_{i}}{\max_{j}(v_{j})}\times \max_{j}(v_{j})\right)^{p}\right)^{1/p} = \lim_{p\to\infty}\left(\sum_{i=1}^{N}\left(\frac{v_{i}}{\max_{j}(v_{j})}\right)^{p}\times \max_{j}(v_{j})^{p}\right)^{1/p}$$

Now we can use the fact that $ab + ac + ad + \dots = a\times(b+c+d+\dots)$ to write:

$$\|\mathbf{v}\|_{\infty} = \lim_{p\to\infty}\left(\max_{j}(v_{j})^{p}\sum_{i=1}^{N}\left(\frac{v_{i}}{\max_{j}(v_{j})}\right)^{p}\right)^{1/p}$$

We know that $(ab)^{1/p} = a^{1/p}b^{1/p}$, so we can write:

$$\|\mathbf{v}\|_{\infty} = \max_{j}(v_{j})\lim_{p\to\infty}\left(\sum_{i=1}^{N}\left(\frac{v_{i}}{\max_{j}(v_{j})}\right)^{p}\right)^{1/p}$$

We can see that as we are dividing each term by $\max_{j}(v_{j})$, each element of the sum must be between $0$ and $1$ (as a number between $0$ and $1$ raised to any power $p$ must be between $0$ and $1$) and at least one element in the sum must be $1$ (as the biggest number divided by itself is $1$). Therefore, the biggest that our sum can ever be is $N$ and the smallest it can ever be is $1$.

Finally, we can see that $\lim_{p\to\infty}N^{1/p} = \lim_{p\to\infty}1^{1/p} = 1$ (try it on a calculator if you don’t believe me!) and so we can see that:

$$\|\mathbf{v}\|_{\infty} = \max_{j}(v_{j})$$

So the $\infty$-norm just gives us the biggest component of the vector. Phew, that took some work, now for a cup of tea and onto the C++!

## C++ Implementation

So now we know what we’re dealing with, how should we write some C++ code to do this in a generic way?

Well, since C++11, we have a type-safe way of getting arbitrary numbers of parameters – variadic templates! So we know that we should have something like the following:

template <typename... Args>
auto p_norm( unsigned int p, Args&&... args ) -> SOME_TYPE
{
// ...
}


What should the type be? Well, also since C++11, we have had std::common_type<...>::type which gives us a useful, generic way to get the so-called ‘common type’ of a set of types. The exact wording in the standard refers to the result of the ternary operator, but in general, for arithmetic types std::common_type<T0,T1,T2,...>::type has the same type as T0() + T1() + T2() + ..., which, thanks to automatic type promotion gives us at least the type with the largest numeric range in the set. However, this will cause us to return an integer if we pass a set of integers!

In order to get around this problem, we can create a templated struct whose type is double if its template parameter is integral, static asserts if it’s given a non-arithmetic type, or returns the type it has been given otherwise. Let’s call it PromoteToFP and implement it as follows:

template <typename T, bool IsIntegral>
struct PromoteToFP_impl
{
using type = T;
};

template <typename T>
struct Promote_impl<T, true>
{
using type = double;
};

template <typename T>
struct PromoteToFP
{
private:
using stripped_type = std::remove_cv_t<std::remove_reference_t<std::remove_cv_t<T>::type>::type>::type;

public:
static_assert(std::is_arithmetic<stripped_type>::value, "Cannot promote a non-arithmetic type to floating-point");
using type = typename PromoteToFP_impl<stripped_type, std::is_integral<stripped_type>::value>::type;
};

template <typename T>
using PromoteToFP_t = typename PromoteToFP<T>::type;


So now we can give our p_norm function the following nice prototype:

template <typename... Args>
auto p_norm( unsigned int p, Args&&... args )
-> std::common_type_t<PromoteToFP_t<Args>...>
{
// ...
}


Now we’ve got our return type, how do we actually implement this function?

Well, before C++17 the idea would be to implement it in a recursive fashion, computing the power of each argument in turn and summing them together. However, C++17 introduces fold expressions. This awesome language feature allows us to perform operations on each of the elements of a variadic parameter pack and chain each of the results together using a binary operation. I strongly recommend you read more about it at cppreference!

Using this, what would be a messy and somewhat arduous exercise becomes so much simpler:

template <typename... Args>
auto p_norm( unsigned int p, Args&&... args )
-> std::common_type_t<PromoteToFP_t<Args>...>
{
using result_type = std::common_type_t<PromoteToFP_t<Args>...>;

return std::pow( (std::pow(static_cast<result_type>(args), p) + ... ), 1.0/static_cast<result_type>(p) );
}



This is great, we’ve implemented a working $p$-norm in C++17! But we’re not quite finished yet – we still need to consider numerical stability, and it’s good practice to make sure we handle all of the edge cases.

So what are the edge cases? Well, as it stands at the moment, we can call the function with 1 argument, and we can see that the $p$-norm of a 1-dimensional vector (i.e. a scalar) should just return its only component.

Using the constexpr-if functionality also brought in with C++17, we can handle the $1$-dimensional case explicitly:

template <typename... Args>
auto p_norm( unsigned int p, Args&&... args )
-> std::common_type_t<PromoteToFP_t<Args>...>
{
using result_type = std::common_type_t<PromoteToFP_t<Args>...>;

if constexpr(sizeof...(args) == 1)
{
return (args,...);
}
else
{
return std::pow( (std::pow(static_cast<result_type>(args), p) + ... ),
1.0/static_cast<result_type>(p) );
}
}


So now we’ve taken care of that, is there anything else that we can improve? The first thing to notice is that we’re taking potentially very large powers of potentially very large numbers, so overflow is quite likely! We can avoid this by using dividing each of the elements by the largest element and then multiplying the end result by the maximum element.

To do this, we need to write ourselves a variadic max function, let’s call it v_max to avoid confusion and potential clashes with the C++ standard library’s std::max. We can implement it like this:

template <typename T, typename U, typename... Args>
constexpr auto v_max( T&& t, U&& u, Args&&... args )
-> std::common_type_t<T, U, Args...>
{
if constexpr(sizeof...(args) == 0)
{
return t < u ? u : t;
}
else
{
return v_max( t, v_max( u, args... ) );
}
}


Using this, we can improve the stability of our p_norm:

template <typename... Args>
auto p_norm( unsigned int p, Args&&... args )
-> std::common_type_t<PromoteToFP_t<Args>...>
{
using result_type = std::common_type_t<PromoteToFP_t<Args>...>;

if constexpr(sizeof...(args) == 1)
{
return (args,...);
}
else
{
const auto max = v_max( static_cast<result_type>(args)... );
return max * std::pow( (std::pow(static_cast<result_type>(args)/max, p) + ... ),
1.0/static_cast<result_type>(p) );
}
}


## Conclusion

So there we have it, a numerically stable implementation of a $p$-norm using modern C++ techniques and idioms.

Thanks for reading, I hope that you enjoyed this blog post, please let me know if you have any comments/corrections/critiques!

Note: This has been compiled and tested using Clang 3.9, which as far as I know is the only compiler to currently support all of the C++17 features used in this article.