Here is an ordered pair: {a, {a, b}}, which can be written in shorthand as (a, b). The set itself is ordered by the relation of belonging: a belongs to {a, b}, but {a, b} does not belong to a. The first item in the pair is the “smaller” of the two elements in the set, and the second is what remains of the “larger” element once the smaller element has been removed from it. (If the pair is (a, a), i.e. the first and second items are the same, then the set is {a, {a}}: nothing remains once a is removed, so the second item is a again.)
What makes the set {a, {a, b}} an ordered pair? Ultimately, it is the pair of operations that extract the first and second items of the pair respectively. Another way of representing the same pair (a, b) is with a set in which the items are “tagged” with indices, like this: {{a, 0}, {b, 1}}. Again there is a pair of operations that will retrieve a and b respectively from this set.
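The two encodings and their extraction operations can be sketched in Python, with frozenset standing in for a set. This is only an illustration under assumptions the text does not state: the items are "atoms" (strings here) that are never themselves sets and never equal to the tags 0 and 1, and all function names are invented for the example.

```python
# Sketch: two set encodings of the ordered pair, each with its own projections.
# Assumption: items are atoms (e.g. strings), never frozensets or the tags 0/1.

def nested_pair(a, b):
    """Encode (a, b) as {a, {a, b}}."""
    return frozenset({a, frozenset({a, b})})

def nested_first(p):
    # The "smaller" element is the one that is not itself a set.
    (atom,) = [x for x in p if not isinstance(x, frozenset)]
    return atom

def nested_second(p):
    first = nested_first(p)
    (larger,) = [x for x in p if isinstance(x, frozenset)]
    rest = larger - {first}
    # If nothing remains, the pair was (a, a).
    return next(iter(rest)) if rest else first

def tagged_pair(a, b):
    """Encode (a, b) as {{a, 0}, {b, 1}}."""
    return frozenset({frozenset({a, 0}), frozenset({b, 1})})

def _untag(p, tag):
    # Find the member carrying the tag, then strip the tag from it.
    (member,) = [s for s in p if tag in s]
    (item,) = member - {tag}
    return item

def tagged_first(p):
    return _untag(p, 0)

def tagged_second(p):
    return _untag(p, 1)

n = nested_pair("a", "b")
t = tagged_pair("a", "b")
assert n != t  # different sets...
assert (nested_first(n), nested_second(n)) == (tagged_first(t), tagged_second(t))  # ...same pair
```

The two values are unequal as sets, yet each answers "first" and "second" identically through its own pair of operations.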
To say that these two sets are both “representations” or “instantiations” of the same object is to define the object itself not in terms of what set it “is” but in terms of what can be done with it: what mappings it supports.
Take the case of the product of two sets, p and q. This is the set of all of the ordered pairs that can be made with elements of p in the first position and elements of q in the second. If p = {1, 2} and q = {a, b}, then their product p * q = {(1, a), (2, a), (1, b), (2, b)}. We have seen that there are at least two forms that these ordered pairs can take. What makes the set p * q the product of p and q is not its particular form, but the relationships it has to p and q (the mappings that project each pair onto its first and second items), and the particular relationship it must have with any other set that has mappings to p and q: each such set has exactly one mapping into p * q that agrees with its own mappings once the projections are applied.
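A small Python sketch may make that relationship concrete. Tuples stand in for ordered pairs, and the set x with its mappings f and g is a hypothetical example invented for the illustration; nothing here depends on which encoding of the pair is used.

```python
from itertools import product

p = {1, 2}
q = {"a", "b"}

# One concrete form of p * q: Python tuples stand in for ordered pairs.
p_times_q = set(product(p, q))

def proj_p(pair):
    return pair[0]

def proj_q(pair):
    return pair[1]

# A hypothetical "other set" x, with mappings f: x -> p and g: x -> q
# written as lookup tables.
x = {"u", "v", "w"}
f = {"u": 1, "v": 2, "w": 1}
g = {"u": "a", "v": "a", "w": "b"}

# f and g together determine a mapping h: x -> p * q.
h = {e: (f[e], g[e]) for e in x}

# h lands in the product, and projecting after h recovers f and g.
assert all(h[e] in p_times_q for e in x)
assert all(proj_p(h[e]) == f[e] and proj_q(h[e]) == g[e] for e in x)
```

Because a pair is fixed by its two projections, any other mapping from x into p * q that satisfies the last assertion must coincide with h. It is this property, rather than the use of tuples, that marks p_times_q as a product.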
The standard set-theoretic representation of a mapping is its graph, given as a set of ordered pairs of the mapping’s inputs and outputs. For example, given the function f(x) = x * 2 ranging over the non-negative integers, the set representing this function is {(0, 0), (1, 2), (2, 4), (3, 6), …}. This captures the sense of a totally regular relationship between the input of a function and its output, but not the operational sense in which a function works on its input to produce an output. And we cannot present the graph of a function in this manner without already having defined an ordered pair in terms of its characteristic properties, or in other words the operations it supports. This isn’t a particularly vicious circularity, but it does indicate that there’s something about the concept of what category theory calls a morphism between objects that the set-theoretic representation of a function as its graph does not fully capture.
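The contrast between the two senses of a function can be sketched in Python. The names double and apply_graph are invented for the illustration, and only a finite fragment of the graph is built, since the full graph is infinite.

```python
def double(x):
    # Operational sense: a rule that works on its input to produce an output.
    return x * 2

# Extensional sense: a finite fragment of the graph, as a set of pairs.
graph = {(x, double(x)) for x in range(4)}

def apply_graph(graph, x):
    # "Applying" a graph means finding the pair whose first item is x
    # and reading off its second item.
    (y,) = [out for (inp, out) in graph if inp == x]
    return y

assert graph == {(0, 0), (1, 2), (2, 4), (3, 6)}
assert apply_graph(graph, 3) == double(3)
```

Note that apply_graph already relies on being able to take a pair apart into its first and second items, which is the dependence on the pair's operations described above.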
While it is true to say that set theory enables us to build relationally complex structures (partial orders, topologies, groups, rings, sheaves and so on), as we saw with the example of the ordered pair, the salient properties of these structures are defined over them rather than intrinsic to them. Set theory provides the means of definition (e.g. the ability to say things like “a topology on a set is any set of subsets of that set that contains the set itself and the empty set, and is closed under arbitrary unions and finite intersections”), and the whole of the “matter” that supports such definitions. It lays out the entire operational field of (all?) mathematics, whereas category theory discriminates objects within that field in terms of the morphisms between them.
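As an illustration of a property being defined over a set rather than intrinsic to it, here is a minimal Python check of the topology definition for the finite case. The function name is_topology is invented for the example, and the pairwise test is a simplification that is only adequate for finite families of subsets.

```python
from itertools import combinations

def is_topology(points, opens):
    """Check the topology axioms for a finite family of subsets of `points`."""
    points = frozenset(points)
    opens = {frozenset(o) for o in opens}
    # The whole set and the empty set must both be present.
    if points not in opens or frozenset() not in opens:
        return False
    # Every member must be a subset of the underlying set.
    if any(not o <= points for o in opens):
        return False
    # For a finite family, pairwise closure is enough to give closure
    # under all unions and all finite intersections.
    return all((a | b) in opens and (a & b) in opens
               for a, b in combinations(opens, 2))

assert is_topology({1, 2, 3}, [set(), {1}, {1, 2}, {1, 2, 3}])
assert not is_topology({1, 2, 3}, [set(), {1}, {2}, {1, 2, 3}])  # {1, 2} is missing
```

Both families are just sets of subsets; whether one "is" a topology is settled by a predicate applied to it from outside.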