I believe the answer to your question revolves around correcting a subtle confusion between classes and sets in the Cumulative Hierarchy. This can be shown by reference to Samuel Coskey's Senior Thesis, "Partial Universes and the Axioms of Set Theory", which can be found by title on Semantic Scholar. This thesis shows which $ZFC$ axioms hold at each stage of the Cumulative Hierarchy. Here is a short synopsis of his results:

Axioms that always hold (at each stage of the hierarchy): Extensionality, Foundation, Union, Axiom Schema of Separation, Choice.

Axioms that hold in $V_{\alpha}$ iff $\alpha$ $\gt$ 0: Empty Set.

Axioms that hold in $V_{\alpha}$ iff $\alpha$ $\gt$ $\omega$: Infinity.

Axioms that hold at limit ordinals: Power Set, Pairing.

Axioms that hold at inaccessible cardinals: Replacement.
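To see concretely why Pairing and Power Set are not in the "always hold" group, here is a small sanity check at a finite successor stage. This is my own illustration, not taken from Coskey's thesis: the helper names `power_set` and `V` are hypothetical, and the sketch only covers finite stages, so it says nothing directly about limit ordinals.

```python
from itertools import chain, combinations

def power_set(s):
    """Return the set of all subsets of the finite set s, as frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

def V(n):
    """Finite stage V_n, with V_0 empty and V_{k+1} = P(V_k)."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage

empty = frozenset()                 # the empty set
one = frozenset({empty})            # {∅}
pair = frozenset({empty, one})      # the unordered pair {∅, {∅}}
V2, V3 = V(2), V(3)

both_in_V2 = empty in V2 and one in V2   # True: both members live in V_2
pair_in_V2 = pair in V2                  # False: Pairing fails inside V_2
pair_in_V3 = pair in V3                  # True: the pair appears one stage later
pow_in_V2 = power_set(one) in V2         # False: P({∅}) is not in V_2
pow_in_V3 = power_set(one) in V3         # True: it appears in V_3

print(both_in_V2, pair_in_V2, pair_in_V3, pow_in_V2, pow_in_V3)
```

The pair and the power set always show up one stage later, which is why a stage with no last step before it (a limit stage) is what one needs for these two axioms.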

(Note that although the existence of the empty set can be treated as an axiom, it can also be derived from Separation as follows:

{$y$ $\in$ $x$ | $y$ $\ne$ $y$})

Coskey shows, in his thesis:

Theorem 5.8. The Empty Set axiom holds in $V_{\alpha}$ iff $\alpha$ $\gt$ 0.

Proof. $V_0$ = $\emptyset$ (= { }), so nothing postulating the existence of a set holds. On the other hand, $\emptyset$ $\in$ $\mathcal P$($\emptyset$) = $V_1$.
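The first few stages can be built explicitly, which makes the content of Theorem 5.8 easy to check by hand. The following is a minimal sketch of my own (the function names `power_set` and `V` are hypothetical, and only finite stages are modelled, using Python `frozenset`s as hereditarily finite sets).

```python
from itertools import chain, combinations

def power_set(s):
    """Return the set of all subsets of the finite set s, as frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

def V(n):
    """Finite stage V_n, with V_0 empty and V_{k+1} = P(V_k)."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage

empty = frozenset()
sizes = [len(V(n)) for n in range(5)]   # sizes of V_0 .. V_4

print(sizes)            # [0, 1, 2, 4, 16]
print(empty in V(0))    # False: V_0 has no elements at all
print(empty in V(1))    # True: the empty set is an element of V_1
```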

Before I continue further, let me recall Elie's conundrum as stated in his question:

In order to define the Universe of Sets we must begin with a concept of ordinals, but in order to define the ordinals we need to have a concept of the Universe of Sets!...So my question is to ask: Is this definition circular?

Coskey's Theorem 5.8 seems (at least to me) to suggest that the conundrum is real because (to repeat)

$V_0$ = $\emptyset$, so nothing postulating the existence of a set holds. On the other hand, $\emptyset$ $\in$ $\mathcal P$($\emptyset$) = $V_1$.

yet $V_0$ = $\emptyset$ seemingly postulates the existence of the 'empty set' outright. Since one can derive the Empty Set Axiom from Separation, it is instructive to look at Coskey's proof that Separation always holds at each stage of the cumulative hierarchy:

Axiom 6 (Separation). If $x$ is a set and $y$ is a class such that $y$ $\subset$ $x$, then $y$ is a set.

Theorem 5.6. The Separation axiom schema holds in all $V_{\alpha}$.

Proof. Suppose $x$ is an element of $V_{\alpha}$. If $y$ is any class at all which is a subset of $x$, then we have $y$ $\subset$ $x$ $\in$ $V_{\alpha}$ and so by Corollary 4.6 [If $y$ $\subset$ $x$ and $x$ $\in$ $V_{\alpha}$, then $y$ $\in$ $V_{\alpha}$], $y$ $\in$ $V_{\alpha}$ as well.

But now consider the proof at $V_0$ = $\emptyset$. By Coskey's definition of $\subset$, $\emptyset$ $\subset$ $\emptyset$ so that $\emptyset$ $\in$ $\emptyset$, which seems to defy Foundation. With this in mind, I now consider Coskey's rendition and proof that Foundation holds for all $V_{\alpha}$:

Axiom 10 (Foundation). If $x$ is a set, then there is a $z$ $\in$ $x$ such that $z$ $\cap$ $x$ = $\emptyset$

Theorem 5.2. The Axiom of Foundation holds for all $V_{\alpha}$.

Proof. If $x$ $\in$ $V_{\alpha}$, then there is a $z$ $\in$ $x$ such that $z$ $\cap$ $x$ = $\emptyset$. Since $V_{\alpha}$ is transitive, in fact $z$ $\in$ $V_{\alpha}$ as well.
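For a finite stage, both ingredients of this proof (the existence of an $\in$-minimal $z$ and the transitivity of the stage) can be checked by brute force. This is only a sketch of my own under stated assumptions: the names `power_set`, `V`, and `has_minimal_element` are hypothetical, the check is restricted to nonempty $x$ (the empty set has no member that could serve as $z$), and it covers $V_4$ only.

```python
from itertools import chain, combinations

def power_set(s):
    """Return the set of all subsets of the finite set s, as frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

def V(n):
    """Finite stage V_n, with V_0 empty and V_{k+1} = P(V_k)."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage

def has_minimal_element(x):
    """True if some z in x is disjoint from x (z ∩ x = ∅)."""
    return any(len(z & x) == 0 for z in x)

V4 = V(4)
nonempty = [x for x in V4 if x]

# Every nonempty x in V_4 has an ∈-minimal element z.
foundation_ok = all(has_minimal_element(x) for x in nonempty)

# Transitivity of V_4: every member of a member of V_4 is again in V_4.
transitive_ok = all(z in V4 for x in nonempty for z in x)

print(foundation_ok, transitive_ok)   # True True
```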

Again, consider the proof of Theorem 5.2 for $V_0$. Since $V_0$ = $\emptyset$, obviously by definition of $\emptyset$ ($\exists$$y$$\forall$$z$($z$ $\notin$ $y$)), the proof of foundation fails for $V_0$ but the empty set axiom ensures that $\emptyset$ $\notin$ $\emptyset$. In fact, if one were to assume that $\emptyset$ $\in$ $\emptyset$, this would contradict Coskey's Proposition 4.11

Proposition 4.11. For every $\alpha$, we have $\alpha$ $\notin$ $V_{\alpha}$, but $\alpha$ $\in$ $V_{\alpha+1}$. In other words, $rank$($\alpha$)= $\alpha$.
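For finite ordinals, Proposition 4.11 can be verified directly. The sketch below is my own illustration (the names `ordinal`, `rank`, `power_set`, and `V` are hypothetical); it builds the von Neumann ordinals $0, 1, 2, 3$ and checks that $n$ $\notin$ $V_n$, $n$ $\in$ $V_{n+1}$, and $rank$($n$) = $n$. It says nothing about infinite $\alpha$.

```python
from itertools import chain, combinations

def power_set(s):
    """Return the set of all subsets of the finite set s, as frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

def V(n):
    """Finite stage V_n, with V_0 empty and V_{k+1} = P(V_k)."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage

def ordinal(n):
    """Von Neumann ordinal n = {0, 1, ..., n-1}, built by successor steps."""
    result = frozenset()
    for _ in range(n):
        result = result | frozenset({result})   # successor: a ∪ {a}
    return result

def rank(x):
    """Rank of a hereditarily finite set: least upper bound of rank(y) + 1 for y in x."""
    return max((rank(y) + 1 for y in x), default=0)

checks = [
    (ordinal(n) not in V(n), ordinal(n) in V(n + 1), rank(ordinal(n)) == n)
    for n in range(4)
]
print(checks)   # every entry is (True, True, True)
```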

What Proposition 4.11 shows is that (if one views the cumulative hierarchy as being generated rather than preexisting) the $V_{\alpha}$ (where $\alpha$ is either a successor or limit ordinal) which has been currently generated should be deemed a proper class (since no class $V_{\alpha+1}$ for which $V_{\alpha}$ $\in$ $V_{\alpha+1}$ has yet been generated) until $V_{\alpha+1}$ has been generated, and only then should be deemed a set (since when $V_{\alpha+1}$ has been generated, $V_{\alpha}$ $\in$ $V_{\alpha+1}$). Proposition 4.11 also shows that Extensionality, Foundation, Union, the Separation schema, and Choice should be classified as axioms that hold in $V_{\alpha}$ iff $\alpha$ $\gt$ 0 (since as Coskey rightly points out for the Empty Set axiom at $V_0$, "nothing postulating the existence of a set holds", and at $V_0$, $\emptyset$ = $V_0$ is a proper class). The reason that the $V_{\alpha}$ should be deemed proper classes at the point of generation is the following definition (courtesy of Olivier Esser from his paper, "On the Consistency of a Positive Theory", *Mathematical Logic Quarterly*, Vol. 45, No. 1 (1999), pp. 105-116):

"$a$ is a set" $\Leftrightarrow$ $\exists$$y$( "$y$ is a class" $\land$ $a$ $\in$ $y$)

Note that by setting Extensionality, Foundation, Union, Axiom Schema of Separation, and Choice to hold at $V_{\alpha}$ iff $\alpha$ $\gt$ 0, one has the Cumulative Hierarchy definable in $ZFC$ (the Wikipedia article, "Von Neumann universe", claims that Judith Roitman, in her book, *Introduction to Modern Set Theory* [2011 edition, p. 79], states without reference that the realization that the Axiom of Foundation is equivalent to the equality of the universe of $ZF$ sets to the Cumulative Hierarchy is due to Von Neumann). Note also, however, that the existence of the Cumulative Hierarchy does not determine which ordinals and cardinals actually exist. For example, if one assumes the existence of a strongly inaccessible cardinal $\kappa$, it can be shown that $V_{\kappa}$ is a model of $ZFC$. However, by Proposition 4.11, $\kappa$ $\notin$ $V_{\kappa}$ (but by Proposition 4.11, $\kappa$ $\in$ $V_{\kappa + 1}$ and the aforementioned Wikipedia entry states that $($ $V_{\kappa}$, $\in$, $V_{\kappa + 1}$ $)$ when $\kappa$ is strongly inaccessible is a model of $MK$ set theory). This shows that $\kappa$ is a proper class relative to $V_{\kappa}$ and a set relative to $V_{\kappa+1}$. However, $V_{\kappa + 1}$ is now a proper class and one needs to justify the move from $V_{\kappa}$ to $\mathcal P$($V_{\kappa}$) = $V_{\kappa + 1}$ since $V_{\kappa}$ is a model of $ZFC$ (if one sets $\kappa$ for $V_{\kappa}$ to be the least inaccessible, $V_{\kappa}$ is a model of $ZFC$ + "There is no inaccessible cardinal" and such a $V_{\kappa}$ can be considered to be the universe of all sets $V$) and $\mathcal P$ is only defined relative to $\kappa$ (in fact, since $\kappa$ is a strong limit, one has that for every $\beta$ $\lt$ $\kappa$, $2^{\beta}$ $\lt$ $\kappa$ so that the power set operation definable in $V_{\kappa}$ is not the power set operation needed to define $\mathcal P$($V_{\kappa}$) = $V_{\kappa + 1}$). As has been mentioned before, if one sets $\kappa$ for $V_{\kappa}$ to be the least inaccessible cardinal (call it $\kappa_0$ for $V_{\kappa_0}$), $V_{\kappa_0}$ is a class model of $ZFC$ + "There are no inaccessible cardinals" and can be rightly called $V$.

But what of $V$, the proper class of all sets? By the Burali-Forti paradox, $V$ must always be a proper class (which means that one cuts off the cumulative hierarchy at $\kappa_0$ or at some inaccessible $\kappa$). However, one could possibly imagine $\mathcal P$($V$) (the class of all subclasses of $V$) so that $V$ $\in$ $\mathcal P$($V$). In order to do this, however, one must have a set/class theory like Ackermann set theory. Fortunately, one has the Levy theory (or the Levy-Vaught interpretation of Ackermann set theory, as it is called) which is $ZFC$ + $V_{\kappa}$ $\prec$ $V$ + $\kappa$ inaccessible. Joel David Hamkins, in his answer to my mathoverflow question, "Forcing in Ackermann set theory", has this to say regarding the Levy theory:

Ackermann set theory is a version of set theory where one views the set-theoretic hierarchy as continuing far past the construction of sets, into the construction of classes, classes of classes and so on. In the Ackermann theory, it is as though one is building the full $V_{\alpha}$ hierarchy, but then part-way through one finds a particularly robust $V_{\delta}$ and declares its elements to be real "sets", with everything above $\delta$ declared "classes". (Critics would say that Ackermann's sets are only some of the sets, since his classes behave fundamentally like sets.)

As Francois [Dorais--my comment] points out in his comments, however, the Ackermann theory seems to provide less than what one may want in the realm of classes, a weakness in the theory that is addressed by its natural strengthenings to various set theories in a more $ZFC$-like context. Namely, the Levy theory is $ZFC$ + $V_{\delta}$ $\prec$ $V$ + $\delta$ is inaccessible, where $V_{\delta}$ $\prec$ $V$ is the scheme asserting $\forall$$x$ $\in$ $V_{\delta}$ ($\varphi$($x$) $\Leftrightarrow$ $\varphi(x)^{V_{\delta}}$), which is expressible in the language of set theory augmented with the constant symbol $\delta$. The set $V_{\delta}$ here plays the part of $V$ in Ackermann's theory, and so every model of the Levy scheme is a model of Ackermann set theory, if one regards the elements of $V_{\delta}$ as the official "sets" and the sets above $V_{\delta}$ as the "classes". But the Levy theory asserts more than Ackermann, because not only is the collection of sets existing as an object in the theory, but also it is an elementary substructure of the full universe. In addition, the Levy theory has a fuller treatment of classes, making them more set-like, in that the larger universe above $\delta$, which correspond to the classes of the Ackermann theory, actually satisfy $ZFC$.

For my part (at least), I would like to set $\delta$ for $V_{\delta}$ as equal to $\kappa_0$, the least inaccessible cardinal. That way, the natural models $V_{\kappa}$ of $ZFC$ + "There are no inaccessible cardinals", $ZFC$ + "There exists an inaccessible cardinal", $ZFC$ + "There are two inaccessible cardinals", etc., all correspond to proper classes (and by Gödel's Second Incompleteness Theorem are all separate, distinct theories). As can be easily seen (I think, since Esser's criterion for distinguishing sets from classes holds for the Levy theory), using the Levy theory as a metatheory (or, perhaps better, a schema for generating models for increasingly more comprehensive metatheories) dissolves Elie's conundrum.

How? Well, consider that the metatheories set the context in which a universe of sets can be defined; in fact, the metatheories define the models, a.k.a the "pre-existing universe of sets and ordinals" (as user21820 states in his answer) which are necessary for the cumulative hierarchy to exist and function, but only in a virtual sense (for example, the metatheory defines the power set operation and the union operation by which the cumulative hierarchy can be defined, and the initial set, the empty set, on which the power set operation can operate). It is, however, the cumulative iterative construction which actually forms the model $V_{\kappa}$. However, what it forms is an internal cumulative hierarchy cut off at the inaccessible cardinal of your choice (e.g., $V_{\kappa_0}$ without the existence of $\kappa_0$, which is, of course, the natural model of $ZFC$ + "There are no inaccessible cardinals"). In this aforementioned example, $V_{\kappa_0}$ = $V$, so $V_{\kappa_0}$ is a proper class subject to the Burali-Forti paradox. Note, though, that I have kept (for reference purposes only) the subscript $\kappa_0$ for $V_{\kappa_0}$ (though in actuality I should properly refer to $V_{\kappa_0}$ in this case as $V$). By keeping $\kappa_0$ as an index for $V_{\kappa_0}$, one implicitly assumes that there exists a model $V_{\kappa_1}$ where $\kappa_0$ $\lt$ $\kappa_1$ and $\kappa_1$ is also inaccessible in which $\kappa_0$ is a set (and therefore a set model of $ZFC$ + "There is no inaccessible cardinal"), or (in the Levy theory) a proper class. In fact, one has the following rule:

$R1$. If one assumes the existence of an inaccessible $\kappa_{i}$ where $i$ $\in$ $Ord$, one implicitly assumes the existence of the inaccessible $\kappa_{i + 1}$.

This follows from the following theorems found in Jeroen Hekking's Bachelor's Thesis, "Natural Models, Second-order Logic & Categoricity in Set Theory":

Theorem 3.1.2. A cardinal $\kappa$ is inaccessible iff $\mathcal M_{\kappa}$ $\vDash$ $ZFC^2$.

Proposition 3.1.1. For all Henkin models $<$ $\mathcal M$, $\mathcal G$ $>$ in $\sigma$ satisfying $ZFC^2$ we have $\mathcal M$ $\vDash$ $ZFC$ [where $\sigma$ is the signature consisting of the symbol $\in$--my comment. A Henkin model is a pair $<$ $\mathcal M$, $\mathcal G$ $>$ with $\mathcal M$ a first-order model and $\mathcal G$ a collection of relations and functions satisfying second-order choice and second-order comprehension. The latter determines the range of our second-order variables and contains, by comprehension, all second-order definable functions and relations on $\mathcal M$. If we take all relations and functions on $\mathcal M$ we get a *full* second-order model, which will be denoted simply by $\mathcal M$ (this is a direct quote of Hekking's definition of Henkin model, and his second-order deductive system consists of Second-order Choice, Second-order Comprehension, and "some straight-forward rules for manipulating quantifiers and logical connectives as given in Shapiro's *Foundations without foundationalism; A case for second-order logic*, (1991), pg. 66"--my comment also). Note also that full models are Henkin models as well].

Theorem 3.2.2. Let $\mathcal M$ be a full model in $\sigma$ of $ZFC^2$. Then $\mathcal M$ is uniquely isomorphic to the natural model $\mathcal M_{\kappa}$ with $\kappa$ $\cong$ $O^{\mathcal M}$ [where $O^{\mathcal M}$ is the class of ordinals of $\mathcal M$. $\kappa$ is defined as the *ordinal height* of $\mathcal M$--my comment].

Corollary 3.2.3 (External Semi-categoricity). The theory $ZFC^2$ is semi-categorical with respect to full models. That is, for any two models $\mathcal M$, $\mathcal N$ $\vDash$ $ZFC^2$ in $\sigma$ we can uniquely embed $\mathcal M$ as an initial segment into $\mathcal N$ [that is, if the ordinal height of $\mathcal M$ is less than the ordinal height of $\mathcal N$--my comment], or the other way around.

(Here is the proof: By 3.2.2. and 3.1.2., $O^{\mathcal M}$ $\cong$ $\kappa$ so that $O^{\mathcal M}$ is inaccessible (so for the least inaccessible cardinal $O^{\mathcal M_0}$ $\cong$ $\kappa_0$, where $\mathcal M_0$ $\vDash$ $ZFC$ + "There are no inaccessible cardinals", $\kappa_0$ must exist as a set or as a proper class in the Levy theory in order that Replacement holds in $\mathcal M_0$). Since $rank$($O^{\mathcal M}$) = $\kappa$, $O^{\mathcal M}$ $\in$ $V_{\kappa + 1}$, so for $V_{\kappa + 1}$ = $\mathcal P$($V_{\kappa}$), one of necessity needs to assume (because $\mathcal P$($x$) can only be defined at limit ordinals) that there exists a larger inaccessible cardinal $\kappa^{'}$ (i.e. $\kappa$ $\lt$ $\kappa^{'}$ so that by Corollary 3.2.3, $\mathcal M_{\kappa}$ is an initial segment of $\mathcal M_{\kappa^{'}}$, i.e., $V_{\kappa}$ is an initial segment of $V_{\kappa^{'}}$ and Replacement will hold for $V_{\kappa^{'}}$). But then by the previous argument, there must of necessity exist a larger inaccessible cardinal $\kappa^{''}$, etc. in order to keep the power set operator defined for the entire cumulative hierarchy (since the $\kappa$ for the $V_{\kappa}$ for $\kappa$ inaccessible are ordinals, they themselves can be well-ordered and can be indexed by ordinals, hence the theorem is proved--note again that for the least inaccessible cardinal $\kappa_0$, it must exist for the simple reason that if $\kappa_0$ is inaccessible then Replacement holds for $V_{\kappa_0}$, which then is a model of (by Proposition 3.1.1) $ZFC$ + "There are no inaccessible cardinals"--note also that if one sets $V_{\kappa_0}$ to be the class of all sets $V$ and any larger inaccessible $V_{\kappa^{'}}$ to be 'proper classes', then one has a model of the Levy theory since Corollary 3.2.3 shows that $V_{\kappa_0}$ $\prec$ $V$).)

Since it is known that the largest of the large cardinals can be expressed in terms of elementary embeddings, it behooves me to find out what elementary embeddings represent strongly inaccessible cardinals.

In the Cantor's Attic entry, "Elementary Embedding", one finds the following under the subheading "Use in Large Cardinal Axioms":

There are two ways of making the critical point as large as possible:

- Making $\mathcal M$ as large as possible, much larger than $\mathcal N$ (meaning that a "large" class can be embedded into a smaller class)
- Making $\mathcal M$ and $\mathcal N$ more similar (for example, $\mathcal M$ = $\mathcal N$ yet $j$ is nontrivial)

Using the first method, one can simply take $\mathcal M$ = $V$ (the universe of all sets), and the resulting critical point is always a measurable cardinal, a very strong type of cardinal, e.g. the first measurable is larger than infinitely many weakly compact cardinals (and much more).

Using the second method, one can take, say, $\mathcal M$ = $\mathcal N$ = $L$, i.e. create [a non-trivial--my comment] embedding $j$: $L$ $\rightarrow$ $L$, whose existence has very important consequences, such as the existence of $0^{\sharp}$ (and thus $V$ $\neq$ $L$) and implies that every ordinal that is an uncountable cardinal in $V$ is strongly inaccessible in $L$. By taking $\mathcal M$ = $\mathcal N$ = $V_{\lambda}$, i.e. a rank of the cumulative hierarchy, one obtains the very powerful rank-into-rank axioms, which sit near the very top of the large cardinal hierarchy. However, this second method has its limits, as shown by Kunen, as he showed that $\mathcal M$ = $\mathcal N$ = $V$ leads to an inconsistency with the axiom of choice, a theorem now known as the Kunen inconsistency. He also showed that a natural strengthening of the rank-into-rank axioms, $\mathcal M$ = $\mathcal N$ = $V_{\lambda+2}$ for some $\lambda$ $\in$ $Ord$, was inconsistent with the $AC$.

Most large cardinal axioms in between measurables and rank-into-rank axioms are obtained by mixing those two methods: one usually sets $\mathcal M$ = $V$ then requires $\mathcal N$ to satisfy strong closure properties to make it "larger", i.e. closer to $V$ (that is, to $\mathcal M$). For example, if $j$: $V$ $\rightarrow$ $\mathcal N$ is nontrivial with critical point $\kappa$ and the cumulative hierarchy rank $V_{j(\kappa)}$ is a subset of $\mathcal N$ then $\kappa$ is superstrong; if $\mathcal N$ contains all sequences of elements of $\mathcal N$ of length $\lambda$ for some $\lambda$ $\gt$ $\kappa$ then $\kappa$ is $\lambda$-supercompact, and so on.

The existence of a nontrivial elementary embedding $j$: $\mathcal M$ $\rightarrow$ $\mathcal N$ *that is definable in* $\mathcal M$ implies that the critical point $\kappa$ of $j$ is measurable in $\mathcal M$ (not necessarily in $V$). Every measurable ordinal is weakly compact and (strongly) inaccessible therefore its existence in any model is beyond $ZFC$, meaning that $ZFC$ cannot prove that such a cardinal exists. [Note that at least according to Noah Schweber, "...my impression is that when we say '$\kappa$ is ... $I_0$' we mean '$\kappa$ is *the critical point of* an $I_0$ embedding,' and this is always inaccessible (*and measurable, and etc.*)...Regardless, even if you refer to the *rank* level of the embedding, the property 'is the critical point of...' is equiconsistent and *does* define an inaccessible...." so (if Noah is correct) that even the largest known large cardinal axiom fits into the pattern $j$: $V$ $\rightarrow$ $M$ and also into the pattern $\mathcal Z$ = {$V_{\kappa}$ | $\kappa$ inaccessible} satisfying the Universe Axiom for $V_{\kappa}$ $\vDash$ $ZFC^2$ needing, for $V_{\kappa}$ to be a set (or a proper class according to the Levy theory), even larger inaccessible $\kappa$'s above it. This suggests, by the theorems for $ZFC^2$ listed above, that $\mathcal Z$ = {$V_{\kappa}$ | $\kappa$ inaccessible} is the 'universe $V$' of $ZFC^2$, which, by the above argument, can never be completed. (Note also that if Choice fails above $I0$ in such fashion that above $I0$, $ZF$ + "There exists a Reinhardt cardinal" holds, then it can be shown in $ZF$ that the Reinhardt cardinal $\kappa_{Reinhardt}$ is inaccessible [according to Prof. Hamkins in his answer to Tim Campion's mathoverflow question, "Does $Con$($ZF$ + Reinhardt) really imply $Con$($ZFC$ + $I0$)?"] so that the 'choiceless cardinals' can be elements of $\mathcal Z$ = $V^2$ $\vDash$ $ZF^2$ (and by Proposition 3.1.1 of Hekking, of $V$ $\vDash$ $ZF$) as well, lending credence to the view that $V^2$ (and therefore, by Proposition 3.1.1, $V$) is not fixed according to height.)]
