C++ Views: Can You Skip Adjacent Duplicates?

Looking for a non-destructive way to skip adjacent duplicates in C++? Learn what std::views offers and how to simulate views::unique.
  • 🧪 std::views::unique is not currently part of the C++20 or C++23 standard libraries.
  • 🧱 std::unique is unsuitable for immutable data as it modifies the original range in place.
  • 🧠 Stateful function objects can mimic views::unique on a single pass without changing the data.
  • ⚙️ Custom filters work well in pipelines. They skip duplicates efficiently and lazily.
  • 🔨 A reusable unique_filter template works for many different element types to remove adjacent duplicates.

C++ std::views: Skipping Adjacent Duplicates with views-like Composability

If you want a composable way to skip adjacent duplicates in C++, you are probably looking for something like views::unique. The standard library does not provide it yet, but you still have options. C++20 introduced std::views, and C++23 added more adaptors. By passing your own callable to views::filter, you can mimic adjacent deduplication while leaving the original data untouched.


What Is std::views in Modern C++?

C++20 added the Ranges library. It changed how we work with sequences such as std::vector and std::list. A central part of it is std::views: lightweight adaptors that operate on existing ranges lazily and compose into pipelines, usually without taking ownership of the data or modifying it.

Key Traits of std::views

  • Can Be Combined: You can link views together like a pipeline. This lets you filter, map, and slice data easily. The code is also easy to read.
  • Lazy Work: Work only happens when you need the results. This stops extra processing and memory use.
  • No Copying: Most views refer to the original elements instead of copying them. This keeps performance good and memory use low.
  • No Changes to Data: Adaptors such as filter and transform do not modify the underlying range. They change how the data is seen, not the data itself.

These traits make std::views very useful. This is true when you work with real-time data, big datasets, or apps where speed matters a lot.


Defining the Problem: Skipping Only Adjacent Duplicates

Removing duplicates from a dataset is often needed. This applies whether you are:

  • Cleaning up repeating log entries.
  • Filtering sensor data in IoT apps.
  • Stopping repeated user actions in a series.

But you need to tell the difference between complete deduplication and adjacent duplicate removal:

  • 🌀 Global Deduplication: This removes all repeated values from a collection. It does not matter where they are. You can use std::set or sort the data for this.
  • 🔄 Adjacent Deduplication: This only removes duplicates that are right next to each other. For example, [A, A, B, A] becomes [A, B, A].

Adjacent deduplication depends on context: whether an element is kept depends on its neighbor. It is often needed for streaming data, where the order of items must be preserved and duplicates come from the same value repeating in quick succession.
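
To make the difference concrete, here is a minimal eager sketch of both behaviors. The function names dedupe_global and dedupe_adjacent are made up for this illustration, and the element type is fixed to char to match the [A, A, B, A] example above.

```cpp
#include <unordered_set>
#include <vector>

// Global deduplication: keep only the first occurrence of each value,
// no matter where later repeats appear.
std::vector<char> dedupe_global(const std::vector<char>& in) {
    std::vector<char> out;
    std::unordered_set<char> seen;
    for (char c : in) {
        // insert().second is true only the first time a value is seen
        if (seen.insert(c).second) out.push_back(c);
    }
    return out;
}

// Adjacent deduplication: drop a value only when it equals the element
// directly before it in the input.
std::vector<char> dedupe_adjacent(const std::vector<char>& in) {
    std::vector<char> out;
    for (char c : in) {
        if (out.empty() || out.back() != c) out.push_back(c);
    }
    return out;
}
```

For the input [A, A, B, A], the global version yields [A, B] and the adjacent version yields [A, B, A]. Note that the global version needs extra storage for the values already seen, while the adjacent version only needs the previous element.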

Why views::filter Isn’t Enough

A common way people try to do this is:

std::views::filter([](int x) { return /*...*/ });

This looks promising at first, but the predicate passed to filter receives one element at a time. It has no access to the previous element, so a stateless lambda cannot compare neighbors.


The Ghost of views::unique

You won't find std::views::unique in C++20 or C++23. Programmers who expect a new version of std::unique that uses views are often surprised it is not there.

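A unique adaptor already exists in range-v3, and range-v3 adaptors are a regular source of candidates for standardization. So far, though, no views::unique has been adopted:

  • P2321, sometimes cited in this context, is the C++23 paper that added views::zip, views::zip_transform, views::adjacent, and views::adjacent_transform. It does not add views::unique.
  • Until a paper for it is adopted, views::unique is just an idea. It is not official yet.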

Why Isn’t It Included Yet?

  • Designing a views::unique that behaves well across range categories (input, forward, bidirectional) and element types takes care.
  • Tracking the "previous element" does not fit a filter predicate, which is expected to give the same answer every time it is called with the same argument. A dedicated adaptor has to compare neighboring positions instead.
  • The standards committee and the library implementers must agree on how it should act in all special situations. This takes time.

So do not expect it soon. If you want adjacent deduplication today, you’ll need to implement it yourself.


Why std::unique Isn't Viable for Views

The std::unique algorithm removes adjacent duplicates, as its name suggests. But it is eager and it mutates the range, so it does not fit lazy pipelines or data that must not change. That is exactly where std::views is used.

Key Problems with std::unique

  • It Changes Data: It shifts the kept elements toward the front of the range, overwriting duplicates. The elements left at the tail are valid but have unspecified values.
  • It Needs Mutable Access: It works through mutable iterators, so it cannot be used on a const container or on data you are only allowed to read.
  • Needs an Extra Step: std::unique only returns the new logical end. The container keeps its old size until you call erase.

Example:

std::vector<int> data = {1, 1, 2, 3, 3, 4};
auto it = std::unique(data.begin(), data.end());
data.erase(it, data.end());  // Erases the unspecified tail left behind by std::unique

But std::views is about seeing data without breaking it. You want to see data that is filtered and has duplicates removed. You do this without changing the original container.
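
If an eager copy is acceptable, std::unique_copy is a non-mutating middle ground from the standard library. It is not lazy and it allocates a second container, so it is not a substitute for a view, but it does leave the source untouched. A minimal sketch, where unique_copy_of is a helper name chosen for this example:

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Copies `in` into a new vector, collapsing each run of equal adjacent
// elements to its first element. `in` itself is not modified.
std::vector<int> unique_copy_of(const std::vector<int>& in) {
    std::vector<int> out;
    out.reserve(in.size());  // upper bound; the result may be shorter
    std::unique_copy(in.begin(), in.end(), std::back_inserter(out));
    return out;
}
```

With the data from the example above, {1, 1, 2, 3, 3, 4}, the returned vector holds {1, 2, 3, 4} and the input keeps all six elements.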


Simulating views::unique with Stateful Filters

Adjacent deduplication always needs context: the last item seen. A plain lambda has nowhere to keep that context unless it captures it, and a by-reference capture ties the filter to a variable that lives outside the view.

Why Lambdas Fail

Even if you try something like:

int last = -1;
auto view = data | std::views::filter([&](int x) {
    bool keep = x != last;
    last = x;
    return keep;
});

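This works for a single pass in simple cases. But it is fragile in pipelines. Why?

  • The state lives outside the view. If the view is returned or stored beyond the scope of last, the captured reference dangles.
  • Views and the lambdas they hold often get copied. Every copy shares the same last and nothing resets it, so a second traversal starts with stale state.
  • The sentinel -1 is a real int value. If the data starts with -1, that first element is wrongly dropped.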

The Proper Solution: Stateful Functor

A standard way to safely keep track of things between filter calls is to use a callable object. This is a struct that keeps track of its own data.

Example Implementation

struct AdjacentSkipper {
    std::optional<int> last;

    bool operator()(int value) {
        if (last && *last == value) return false;
        last = value;
        return true;
    }
};

Usage:

std::vector<int> data = {1, 1, 2, 2, 3, 3};

auto filtered = data | std::views::filter(AdjacentSkipper{});

for (int i : filtered)
    std::cout << i << ' ';

Output:

1 2 3

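For a single front-to-back traversal, this behaves like a built-in views::unique would. It has limits, though. The predicate carries state, while views::filter formally expects a predicate that returns the same result for the same argument. To stay on safe ground, iterate the view once, build a fresh functor for each pipeline, and avoid adaptors or algorithms that revisit elements.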


Applying to Other Types (Strings, Structs, etc.)

The same pattern carries over to more complex types, like strings or structs. Here is a version for log messages:

struct SkipDuplicateLogs {
    std::optional<std::string> prev;

    bool operator()(const std::string& msg) {
        if (prev && *prev == msg) return false;
        prev = msg;
        return true;
    }
};

Pipeline:

std::vector<std::string> logs = {"error", "error", "warn", "warn", "error"};

auto result = logs 
    | std::views::filter(SkipDuplicateLogs{}) 
    | std::views::transform([](const std::string& s) {
        return "[LOG]: " + s;
    });

for (const auto& log : result)
    std::cout << log << '\n';

Building a Reusable unique_filter<T> Template

Making code generic makes it reusable. Here's one way to abstract the deduplication functor:

template <typename T, typename Compare = std::equal_to<>>
struct unique_filter {
    std::optional<T> prev;
    Compare comp;

    bool operator()(const T& val) {
        if (prev && comp(*prev, val)) return false;
        prev = val;
        return true;
    }
};

Now you can use it for int, std::string, or custom types:

auto filtered_ints = vec | std::views::filter(unique_filter<int>{});
auto filtered_strs = logs | std::views::filter(unique_filter<std::string>{});

And with a custom comparator:

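// Requires <algorithm>, <cctype>, and <string>
struct CI_Compare {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::equal(a.begin(), a.end(), b.begin(), b.end(),
            // Take unsigned char: std::tolower is undefined for negative char values
            [](unsigned char c1, unsigned char c2) {
                return std::tolower(c1) == std::tolower(c2);
            });
    }
};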

auto filtered = logs | std::views::filter(unique_filter<std::string, CI_Compare>{});

Performance and Memory Efficiency

Using views to filter adjacent duplicates in a pipeline has different trade-offs from the eager std::unique approach:

Metric                | std::unique + erase      | std::views::filter + functor
Modifies container    | Yes                      | No
Lazy evaluation       | No                       | Yes
Element copies        | Moves elements in place  | Reads through references; keeps one copy of the previous element
Reusable in pipelines | Limited                  | Yes, composes with other views
Heap allocation       | None (works in place)    | None from the view; the stored copy may allocate (e.g. std::string)

Lazy views fit best when data arrives in volume and you want to avoid intermediate containers. This includes embedded systems, signal processing, and real-time analytics.


When to Use range-v3

Eric Niebler’s range-v3 is the library where most Ranges features started, before they were added to the C++ standard. It has its own unique adaptor, ranges::views::unique (spelled view::unique in older releases).

Pros:

  • Ready-to-use views::unique.
  • A likely model for any future standard version, since the standard Ranges library grew out of range-v3.
  • Good for quick prototypes, and a natural choice if you already use range-v3.

Cons:

  • Adds a dependency to manage (build system, binaries, CI).
  • Slows down compilation because of heavy template use.

Use range-v3 if:

  • Your project already uses it.
  • You need more advanced or experimental view adaptors.
  • You prefer proven and complete code over your own custom code.

Best Practices Summary

  • ✅ Use stateful functors that own their state to keep track of context.
  • ✅ Keep functors independent of outside data, and create a fresh one for each pipeline.
  • ✅ Use views::filter for lazy filtering that does not change data, and traverse a stateful filter only once.
  • ❌ Do not use lambdas that capture state by reference. The state lives outside the view, so it can dangle or go stale.
  • 🚫 Do not share functor state between views unless they are designed to work together.

Will views::unique Become Standardized?

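There is ongoing interest in bringing more range-v3 adaptors into the standard, but views::unique has not been adopted. P2321, which is sometimes named as its proposal, is actually the C++23 zip paper and does not cover views::unique.

What to Expect:

  • If it is standardized, a consistent interface like std::unique, but one that does not change data.
  • Most likely adjacent deduplication only, matching std::unique and range-v3. Global deduplication needs extra storage and is a different problem.
  • No committed timeline. No final decision has been made.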

Until then, you can combine std::views::filter and well-made functors. This gives you all the options you need today.


Conclusion

Adjacent deduplication is a common but tricky task in C++ apps, especially when speed, data integrity, and modern design all matter. views::unique is still just an idea, but std::views already gives you the tools to build your own clean and reusable solution. Once you understand the problems with by-reference lambdas in lazy pipelines, you can use callable objects that hold their own state and make your code act like views::unique for almost any type, as long as you traverse the view once. You can use filters you write yourself, or you can use a library like range-v3. Either way, C++20 and C++23 let you write clearer and faster code. The versatility of C++ Ranges is not just in what is standard. It is also in what you can build on top of it.

