A data point is not a thing in itself. It is the totality of operations you can apply to it. The Yoneda lemma, translated for people who work with tables.
There is a habit, almost a reflex, of thinking of a dataset as a collection of rows. Each row is a vector; each vector is a point in some ℝ^d; the dataset is the set of these points. Statistics, machine learning, and even quantum encoding all begin from this picture. The picture is not wrong, but it is misleading in a specific way: it suggests that the data point is the primary object, and that the operations one performs on it — distances, kernels, transformations, encodings — are secondary, applied to the data from outside.
The Yoneda lemma reverses the priority. It says, in plain words, that an object is determined entirely by the totality of arrows into it (or out of it). What you can do with X is what X is. Two objects with identical patterns of incoming morphisms are indistinguishable; two patterns of incoming morphisms determine the object up to isomorphism. The object dissolves into its operational role.
For an object X in a category 𝒞, the functor hom(−, X) : 𝒞^op → Set determines X up to isomorphism.
Translation: tell me what arrows land in X, from every other object, and I will tell you exactly what X is.
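For reference, the full statement is stronger than the corollary used here. It is standard (for a locally small category), and the uniqueness claim above follows from it by taking F to be another hom-functor:

```latex
% Yoneda lemma: for a locally small category C, an object X of C,
% and any functor F : C^op -> Set, there is a bijection
\[
  \mathrm{Nat}\big(\mathrm{Hom}_{\mathcal{C}}(-,X),\, F\big) \;\cong\; F(X),
\]
% natural in both X and F. Setting F = Hom_C(-, Y) shows that an
% isomorphism of hom-functors Hom_C(-, X) ≅ Hom_C(-, Y) forces X ≅ Y:
% the Yoneda embedding is fully faithful.
```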
A row of a table, looked at in isolation, has no content. The number 3.7 in the column "petal length" is meaningless without the operations that come with it: comparison with other rows, scaling, distance to another row, projection onto a learned feature, encoding into a quantum state. None of these operations is "applied" to the row. Each of them is, in fact, a morphism from the row into some other space — into the real line, into a Hilbert space, into a representation, into a class label.
The dataset is the totality of these morphisms. What you call "the data" is a shorthand for the family of admissible maps that you have decided to consider.
Practitioners do this without naming it. Kernel methods discard the raw coordinates and keep only pairwise inner products. Embeddings define a word, a user, or a molecule by its pattern of co-occurrences. Contrastive learning trains a representation from nothing but similarity relations between pairs. The Yoneda perspective is implicit in nearly every modern technique that treats raw values with suspicion.
In each case the row of the table is no longer the primary object. The relations are.
If a data point is its hom-functor, then two questions that look identical come apart.
The first is "what is in the dataset." The second is "what class of maps am I willing to call admissible." The first is fixed by collection; the second is a free parameter, and it is where all the modeling happens.
Choose only linear maps and you get classical statistics. Choose smooth maps respecting a group action and you get equivariant ML. Choose unitary maps into a Hilbert space and you get a quantum encoding. Choose unitary maps into a Fock space whose creation operators carry physical meaning, and you get QIFT. The data has not changed. The category in which it lives has.
Stop asking what the data is. Ask what the admissible morphisms are. The first question has no operational answer; the second is a modeling decision, and every consequential choice in machine learning, statistics, and quantum encoding turns out to be an answer to it in disguise.
The Equivalence Theorem of QIFT, in this language, says: the class of probability-loading-followed-by-fixed-unitary encodings is operationally a single class. Different maps within it are isomorphic in the relevant sense. To get genuinely new behavior, the class itself must change.
That is Yoneda for practitioners. Not a theorem you apply, but a question you ask before everything else.