번역. Kafka Protocol Guide

잡다한 자료 2021. 5. 5. 00:29

내가 읽기 위해서, 내맘대로 번역한 글.

KAFKA PROTOCOL GUIDE

This document covers the wire protocol implemented in Kafka. It is meant to give a readable guide to the protocol that covers the available requests, their binary format, and the proper way to make use of them to implement a client. This document assumes you understand the basic design and terminology described here
이 문서는 카프카에서 구현된 유선 프로토콜을 다룬다.
이것은 클라이언트를 구현하기 위해서 유효한 요청과 바이너리포맷 그리고 올바른 사용법을 포함하는 프로토콜에 대한 읽기 쉬운 가이드를 제공한다.
이 문서는 너가 여기에 기술된 기본 설계와 전문용어를 이해하고 있다고 간주한다.
(이석우 추가 : 여기서 유선(wire)의 의미가 뭘까? 진짜 통신선을 의미하는걸까?)

Preliminaries

Network

Kafka uses a binary protocol over TCP. The protocol defines all APIs as request response message pairs. All messages are size delimited and are made up of the following primitive types.
카프카는 TCP를 통한 바이너리 프로토콜을 사용한다.
프로토콜은 모든 API들을 요청과 응답 메세지 쌍으로 정의한다.
모든 메세지는 크기로 구분되고 다음에 나오는 기본 타입으로 구성된다.
(이석우 추가 : 모든 메세지는 정해진 크기가 있다는 뜻인거 같음)

The client initiates a socket connection and then writes a sequence of request messages and reads back the corresponding response message. No handshake is required on connection or disconnection. TCP is happier if you maintain persistent connections used for many requests to amortize the cost of the. TCP handshake, but beyond this penalty connecting is pretty cheap.
클라이언트는 소켓 연결을 초기화 하고, 요청 메세지를 순서대로 쓰고, 해당 응답 메세지를 읽는다.
연결과 연결종료시에 핸드세이크는 필요없다.
TCP 핸드세이크 비용을 아끼기 위해서 많은 요청에 사용할 지속적인 연결들을 관리한다면 TCP는 만족스럽지만,
초과되는 연결의 불이익은 매우 적다. (이석우 추가 : '연결시 마다 발생하는 TCP handshake 는 무시할만 하다' 라는 뜻인가???)

The client will likely need to maintain a connection to multiple brokers, as data is partitioned and the clients will need to talk to the server that has their data. However it should not generally be necessary to maintain multiple connections to a single broker from a single client instance (i.e. connection pooling).
데이타는 여러 서버에 분할 저장되어 있고, 클라이언트는 그 서버들중에서 필요한 데이타를 가지고 있는 서버와 통신해야 하기 때문에
클라이언트는 여러 브로커에 대한 연결을 관리해야 한다.
그러나 하나의 클라이언트가 하나의 브로커에 대해서 다중 연결을 유지하는것은 일반적으로 필요하지 않다.(예 : 연결 풀링)

The server guarantees that on a single TCP connection, requests will be processed in the order they are sent and responses will return in that order as well. The broker's request processing allows only a single in-flight request per connection in order to guarantee this ordering. Note that clients can (and ideally should) use non-blocking IO to implement request pipelining and achieve higher throughput. i.e., clients can send requests even while awaiting responses for preceding requests since the outstanding requests will be buffered in the underlying OS socket buffer. All requests are initiated by the client, and result in a corresponding response message from the server except where noted.
서버는 하나의 TCP 연결안에서, 요청은 보내진 순서대로 처리되고, 응답도 같은 순서로 반환됨을 보장한다.
브로커는 순서를 보장하기 위하여 연결당 진행중인 요청을 하나씩만 허용한다.
클라이언트는 요청 파이프라인을 구현하거나 높은 처리량을 달성하기 위해 비차단IO를 사용할수 있다.
클라이언트는 이전 요청에 대한 응답을 기다리는 중에도 또 요청을 보낼수 있다.
왜냐면 아직 처리되지 않은 요청은 기본적인 OS 소켓 버퍼에 저장되니까.
모든 요청은 클라이언트가 초기화하고, 대응하는 응답 메세지는 서버가 생성한다 (명시된 경우를 제외하고)

The server has a configurable maximum limit on request size and any request that exceeds this limit will result in the socket being disconnected.
서버는 요청 크기에 대한 최대값 제한이 설정되어 있고, 이 제한을 넘는 모든 요청은 소켓 연결이 끊어진다.

Partitioning and bootstrapping

Kafka is a partitioned system so not all servers have the complete data set. Instead recall that topics are split into a pre-defined number of partitions, P, and each partition is replicated with some replication factor, N. Topic partitions themselves are just ordered "commit logs" numbered 0, 1, ..., P-1.
카프카는 분산저장 시스템이기 때문에 모든 서버가 완벽한 데이타를 가지고 있지는 않다.
대신 토픽은 미리정의된 숫자의 파티션(P)에 분할되고, 각 파티션은 복제 요소(N)만큼 복제된다.
토픽 파티션 자체는 0부터 P-1까지 번호가 부여된 "commit logs"로 정렬된다.

All systems of this nature have the question of how a particular piece of data is assigned to a particular partition. Kafka clients directly control this assignment, the brokers themselves enforce no particular semantics of which messages should be published to a particular partition. Rather, to publish messages the client directly addresses messages to a particular partition, and when fetching messages, fetches from a particular partition. If two clients want to use the same partitioning scheme they must use the same method to compute the mapping of key to partition.
이런 성격의 모든 시스템은 특정 데이타 조각이 특정 파티션에 어떻게 할당되는지에 대한 궁금증을 가진다.
카프카 클라이언트가 직접 이 할당을 조절하고, 브로커들 자체는 메세지가 어떤 파티션에 발행되어야 하는지에 대해 강제하지 않는다.
대신 메세지를 발행하기 위해 클라이언트가 직접 특정 파티션을 지정하고,
메세지를 가져올때는 특정 파티션으로 부터 가져온다.
만약 두개의 클라이언트가 같은 파티션 스키마를 사용하려면, 파티션에 대한 키 매핑 계산을 같은 방법으로 해야 한다.

These requests to publish or fetch data must be sent to the broker that is currently acting as the leader for a given partition. This condition is enforced by the broker, so a request for a particular partition to the wrong broker will result in an the NotLeaderForPartition error code (described below).
데이타를 발행하거나 가져오는 요청들은 반드시 주어진 파티션에 대해 현재 리더 역활을 하는 브로커에 보내져야 한다.
이 조건은 브로커에 의해서 강제되는데,
특정 파티션에 대한 요청이 잘못된 브로커에게 갔을때는 NotLeaderForPartition 에러가 발생한다 (아래에 설명함)

How can the client find out which topics exist, what partitions they have, and which brokers currently host those partitions so that it can direct its requests to the right hosts? This information is dynamic, so you can't just configure each client with some static mapping file. Instead all Kafka brokers can answer a metadata request that describes the current state of the cluster: what topics there are, which partitions those topics have, which broker is the leader for those partitions, and the host and port information for these brokers.
클라이언트가 어떤 토픽들이 존해하는지, 어떤 파티션들이 있는지, 어떤 브로커가 현재 그 파티션들의 주인인지 파악하여 올바른 호스트로 보낼수 있는지 어떻게 알수 있을까?
이 정보는 동적이라서, 정적인 매핑파일을 통해서 각 클라이언트를 설정할수 없다.
대신 모든 카프카 브로커들은 현재 클러스터의 상태를 기술하는 메타데이타 요청에 답할수 있다.
메타데이타 : 어떤 토픽들이 있는지, 어떤 파티션이 그 토픽들을 가지고 있는지, 어떤 브로커가 그 파티션들의 리더인지, 그 브로커들에 대한. 주소와 포트정보.

In other words, the client needs to somehow find one broker and that broker will tell the client about all the other brokers that exist and what partitions they host. This first broker may itself go down so the best practice for a client implementation is to take a list of two or three URLs to bootstrap from. The user can then choose to use a load balancer or just statically configure two or three of their Kafka hosts in the clients.
다른말로 하면, 클라이언트는 어떻게든지 하나의 브로커를 찾아야 하고, 그 브로커는 존재하는 모든 다른 브로커와 그 브로커들이 호스트하는 파티션들을 클라이언트에게 알려줘야 한다.
첫번째 브로커 자체가 다운될수 있으므로, 가장 좋은 클라이언트 구현은 부트스트랩할 2개 또는 3개의 URL 목록을 가져오는 것이다.
그런 다음 사용자는 로드 밸런서를 사용하도록 선택하거나 클라이언트에서 2~3개의 카프카 호스트를 정적 구성할수 있다.

The client does not need to keep polling to see if the cluster has changed; it can fetch metadata once when it is instantiated cache that metadata until it receives an error indicating that the metadata is out of date. This error can come in two forms: (1) a socket error indicating the client cannot communicate with a particular broker, (2) an error code in the response to a request indicating that this broker no longer hosts the partition for which data was requested.
클라이언트는 클러스터가 변했는지 알기위해 폴링을 유지할 필요가 없다.
클라이언트가 생성될때 메타데이타를 가져와서 메타데이타가 만료되었다는 에러를 받기전까지 자체적으로 저장하고 있으면 된다.
이 에러는 두가지 형태로 온다.
(1) 클라이언트가 특정 브로커와 통신할수 없음을 나타내는 소켓 에러.
(2) 요청된 데이타를 위한 파티션을 이 브로커가 더이상 호스트하지 않는다는걸 나타내는 응답 에러.

Cycle through a list of "bootstrap" Kafka URLs until we find one we can connect to. Fetch cluster metadata.
부트스트랩 카프카 URL 목록을 차례대로 돌아가면서 연결할수 있는걸 찾는다. 그리고 메타데이타를 가져온다.
Process fetch or produce requests, directing them to the appropriate broker based on the topic/partitions they send to or fetch from.
조회요청 또는 생성요청을 만들고, 토픽과 파티션을 기반으로 보내거나 받을수 있는 적당한 브로커에게 요청을 보낸다.
If we get an appropriate error, refresh the metadata and try again.
만약 적절한 에러가 발생하면, 메타데이타를 새로 받은후 다시 시도한다.

Partitioning Strategies

As mentioned above the assignment of messages to partitions is something the producing client controls. That said, how should this functionality be exposed to the end-user?
위에서 언급했듯이 메세지를 파티션에 할당하는것은 무엇인가를 생산하는 클라이언트가 제어하는것이다.
최종 사용자에게 이 기능을 어떻게 노출해야 하는건인가?

Partitioning really serves two purposes in Kafka:
카프카에서 파티셔닝은 실제로 두가지 용도로 사용된다.

It balances data and request load over brokers
데이타 균형과 브로커들 사이에서 요청 로드
It serves as a way to divvy up processing among consumer processes while allowing local state and preserving order within the partition. We call this semantic partitioning.
로컬상태를 허용하고 파티션안에서 순서를 보장하면서 소비 프로세스간의 처리를 분해하는 방법으로 사용된다.
이를 시멘틱 파티셔닝이라고 부른다.

For a given use case you may care about only one of these or both.
너는 주어진 사용 사례들에 관심을 가질수 있다.

To accomplish simple load balancing a simple approach would be for the client to just round robin requests over all brokers. Another alternative, in an environment where there are many more producers than brokers, would be to have each client chose a single partition at random and publish to that. This later strategy will result in far fewer TCP connections.
단순한 로드 밸런싱을 수행하려면 클라이언트가 모든 브로커에게 차례차례 요청하는 간단한 방법이 있다.
브로커보다 생산자가 더 많은 환경에서, 다른 대안은, 클라이언트가 랜덤하게 파티션 하나를 선택하고, 거기에 발행하는것이다.
이 later 전략은 훨씬 적은 TCP 연결을 생성한다.

Semantic partitioning means using some key in the message to assign messages to partitions. For example if you were processing a click message stream you might want to partition the stream by the user id so that all data for a particular user would go to a single consumer. To accomplish this the client can take a key associated with the message and use some hash of this key to choose the partition to which to deliver the message.
시멘틱 파티셔닝은 메세지안에 있는 어떤 키를 이용하여 파티션에 메세지를 할당하는것을 뜻한다.
예를 들어, 너가 클릭 메세지 흐름을 처리한다면 너는 사용자 id별로 스트림을 파티션하기 원할것이고,
그러면 특정 사용자의 모든 데이타는 단일 소비자에게 갈것이다.
이것을 수행하기 위해서 클라이언트는 메세지와 연관된 키의 일부 해쉬를 이용해서 메세지를 전달할 파티션을 선택할수 있다.

Batching

Our APIs encourage batching small things together for efficiency. We have found this is a very significant performance win. Both our API to send messages and our API to fetch messages always work with a sequence of messages not a single message to encourage this. A clever client can make use of this and support an "asynchronous" mode in which it batches together messages sent individually and sends them in larger clumps. We go even further with this and allow the batching across multiple topics and partitions, so a produce request may contain data to append to many partitions and a fetch request may pull data from many partitions all at once.
우리 API들은 효율을 위하여 작은 일들은 일괄처리하도록 권장한다. 이것이 매우 중요한 성능 향상임을 발견했다.
이것을 권장하기 위하여 메세지를 보내는 API와 메세지를 받는 API 둘다 단일 메세지가 아닌 일련의 메세지들과 함께 동작한다.
영리한 클라이언트는 이것을 이용하고 비동기를 지원하여 개별적으로 보내는 메세지들을 함께 일괄처리하고 큰 덩어리로 보낸다.
심지어 더 나아가 다중 토픽들간에 일괄처리를 허용하고, 생산 요청은 여러 파티션에 추가될 데이타를 포함하고,
조회 요청은 여러 파티션에서 데이타를 한번에 가져올수 있다.

The client implementer can choose to ignore this and send everything one at a time if they like.
원한다면 클라이언트를 구현하는 사람은 이것을 무시하고 한번에 하나씩 모든것을 보낼수 있다.

Compatibility

Kafka has a "bidirectional" client compatibility policy. In other words, new clients can talk to old servers, and old clients can talk to new servers. This allows users to upgrade either clients or servers without experiencing any downtime.
카프카는 양방향 클라이언트 호환성 정책을 가지고 있다.
다른말로, 새로운 클라이언트는 오래된 서버와 통신할수 있고, 오래된 클라이언트는 새로운 서버와 통신할수 있다.
이것은 다운타임 경험없이 클라이언트 또는 서버를 업그레이할수 있도록 해준다.

Since the Kafka protocol has changed over time, clients and servers need to agree on the schema of the message that they are sending over the wire. This is done through API versioning.
카프카 프로토콜은 시간이 지나면서 변경되었기 때문에, 클라이언트와 서버는 유선으로 보내는 메세지의 스키마에 동의해야 한다.
이것은 API 버젼닝을 통해 이루어진다.

Before each request is sent, the client sends the API key and the API version. These two 16-bit numbers, when taken together, uniquely identify the schema of the message to follow.
각 요청이 보내지기 전에, 클라이언트는 API 키와 API 버젼을 보낸다.
이 두개의 16-bit 숫자를 함께 이용하여, 메세지가 따라야할 고유한 스키마를 식별한다.

The intention is that clients will support a range of API versions. When communicating with a particular broker, a given client should use the highest API version supported by both and indicate this version in their requests.
클라이언트가 지원하려는 API 버젼들의 범위를 의도한다.
특정 브로커와 통신할때, 클라이언트는 양쪽에서 모두 지원하는 가장 높은 API 버젼을 사용해야 하고,
이 버젼을 그들의 요청에 표시해야 한다.

The server will reject requests with a version it does not support, and will always respond to the client with exactly the protocol format it expects based on the version it included in its request. The intended upgrade path is that new features would first be rolled out on the server (with the older clients not making use of them) and then as newer clients are deployed these new features would gradually be taken advantage of.
서버는 지원하지 않는 버젼의 요청일때는 거절할 것이고,
요청에 포함된 버젼을 기반으로 예상되는 정확한 프로토콜 포맷을 클라이언트에게 항상 응답할것이다.
업그레이드를 하는 방법은 먼저 새로운 특징을 서버에 출시하고 (새로운 특징을 사용하지 않는 클라이언트들이 있는 상태로)
그러고나서 새로운 클라이언트들이 배포되어 새로운 특징들을 점차적으로 활용하는것이다.

Note that KIP-482 tagged fields can be added to a request without incrementing the version number. This offers an additional way of evolving the message schema without breaking compatibility. Tagged fields do not take up any space when the field is not set. Therefore, if a field is rarely used, it is more efficient to make it a tagged field than to put it in the mandatory schema. However, tagged fields are ignored by recipients that don't know about them, which could pose a challenge if this is not the behavior that the sender wants. In such cases, a version bump may be more appropriate.
태그필드는 버젼 번호를 높이지 않고 요청에 추가될수 있다.
이것은 호환성을 깨지않고 메세지 스키마를 발전시킬수 있는 추가적인 방법을 제공한다.
태그필드는 그 필드가 비어 있을때는 추가적인 공간을 차지하지 않는다.
그러므로 어떤 필드가 드물게 사용된다면, 이것을 필수 스키마에 넣는것보다는 태그필드로 만드는게 더 효율적이다.
그러나 태그필드는 자신에 대해 알지 못하는 수신자들에게는 무시되므로,
이것(수신자에 의해 무시된다는 점)이 발신자가 원하는 동작이 아닌경우 문제가 될수 있다.
이런경우에는 버젼 범프(이게 무슨 말이야?)가 더 적절하다

Retrieving Supported API versions

In order to work against multiple broker versions, clients need to know what versions of various APIs a broker supports. The broker exposes this information since 0.10.0.0 as described in KIP-35. Clients should use the supported API versions information to choose the highest API version supported by both client and broker. If no such version exists, an error should be reported to the user.
여러 브로커 버젼들에 대해서 작업하려면, 클라이언트는 브로커가 지원하는 다양한 API들의 버젼을 알아야만 한다.
KIP-35에 설명되었듯이 브로커 버젼 0.10.0.0 부터 이 정보를 노출한다.
클라이언트는 지원되는 API 버젼들의 정보를 사용해서 클라이언트와 브로커가 둘다 지원하는 가장높은 API 버젼을 선택해야 한다.
버젼이 없으면, 사용자에게 에러를 보고해야 한다.

The following sequence may be used by a client to obtain supported API versions from a broker.
클라이언트는 다음 순서를 사용해서 브로커로 부터 지원되는 API 버젼을 얻을수 있다.

Client sends ApiVersionsRequest to a broker after connection has been established with the broker. If SSL is enabled, this happens after SSL connection has been established.
클라이언트는 브로커와 연결이 확립된 이후 ApiVersionsRequest 를 브로커에게 보낸다.
만약 SSL이 활성화 되어 있으면, 이것은 SSL 연결이 확립된 이후 발생한다.
On receiving ApiVersionsRequest, a broker returns its full list of supported ApiKeys and versions regardless of current authentication state (e.g., before SASL authentication on an SASL listener, do note that no Kafka protocol requests may take place on an SSL listener before the SSL handshake is finished). If this is considered to leak information about the broker version a workaround is to use SSL with client authentication which is performed at an earlier stage of the connection where the ApiVersionRequest is not available. Also, note that broker versions older than 0.10.0.0 do not support this API and will either ignore the request or close connection in response to the request.
ApiVersionsRequest를 받으면, 현재 인증상태와는 상관없이 브로커는 지원하는 ApiKey들과 버젼들의 모든 목록을 반환한다.
(예, SASL 리스너의 SASL 인증전에, SSL 핸드세이크가 완료되기 전에는 SSL 리스너에서 카프카 프로토콜 요청이 발생하지 않는다는걸 유의해라) (이석우 추가 : 음... 무슨 말인지 모르겠다)
만약 이것이 브로커 버젼에 대한 정보 유출로 간주된다면,
ApiVersionRequest를 사용할수 없는 연결의 이전 단계에서 수행되는 클라이언트 인증에 SSL을 사용하는것이 해결법이다.
또한 브로커 버젼이 0.10.0.0 보다 오래되었다면 이 API는 지원하지 않고
그 요청을 무시하거나 그 요청에 대한 응답에서 연결을 끊을것이다.
If multiple versions of an API are supported by broker and client, clients are recommended to use the latest version supported by the broker and itself.
만약 브로커와 클라이언트가 하나의 API에 여러 버젼을 지원한다면, 클라이언트는 브로커와 자신이 지원하는 가장 최신 버젼을 사용는것이 권장된다.
Deprecation of a protocol version is done by marking an API version as deprecated in the protocol documentation.
프로토콜 버젼의 지원중단은 해당 프로토콜 문서에 API 버젼이 지원중단되었다고 표시된다.
Supported API versions obtained from a broker are only valid for the connection on which that information is obtained. In the event of disconnection, the client should obtain the information from the broker again, as the broker might have been upgraded/downgraded in the mean time.
브로커에서 얻은 지원 API 버젼들은 그 정보를 얻은 연결에서만 유효하다.
연결이 끊어지면, 브로커가 그동안에 업그레이드나 다운그레이드가 되었을수 있기 때문에 클라이언트는 브로커에서 정보를 다시 받아야 한다.

SASL Authentication Sequence

The following sequence is used for SASL authentication:
SASL 인증에 다음 절자가 사용된다.

Kafka ApiVersionsRequest may be sent by the client to obtain the version ranges of requests supported by the broker. This is optional.
클라이언트는 카프카 ApiVersionsRequest 를 보내서 브로커가 지원하는 요청의 버젼 범위를 얻는다.
이것은 선택사항이다.
Kafka SaslHandshakeRequest containing the SASL mechanism for authentication is sent by the client. If the requested mechanism is not enabled in the server, the server responds with the list of supported mechanisms and closes the client connection. If the mechanism is enabled in the server, the server sends a successful response and continues with SASL authentication.
클라이언트는 인증을 위해서 SASL 메카니즘을 포함한 카프카 SaslHandshakeRequest 을 보낸다.
만약 요청된 메카니즘이 서버에서 활성화 되어 있지 않다면, 서버는 지원하는 메카니즘 리스트를 응답하고, 클라이언트 연결을 끊는다.
만약 그 메카니즘이 서버에서 활성화 되어 있다면, 서버는 성공을 응답하고 SASL 인증을 계속 진행한다.
The actual SASL authentication is now performed. If SaslHandshakeRequest version is v0, a series of SASL client and server tokens corresponding to the mechanism are sent as opaque packets without wrapping the messages with Kafka protocol headers. If SaslHandshakeRequest version is v1, the SaslAuthenticate request/response are used, where the actual SASL tokens are wrapped in the Kafka protocol. The error code in the final message from the broker will indicate if authentication succeeded or failed.
이제 SASL 인증이 실제로 수행된다.
만약 SaslHandshakeRequest 버젼이 v0 이면, 어쩌구 저쩌구... ????????????????? 해석이 잘 안됨
만약 SaslHandshakeRequest 버젼이 v1 이면, SaslAuthentication 요청/응답이 사용되고, 실제 SASL 토큰은 카프카 프로토콜안에 감싸지게 된다.
브로커에서 온 최종 메세지 안에 있는 에러코드는 인증이 성공인지 실패인지 나타낸다.
If authentication succeeds, subsequent packets are handled as Kafka API requests. Otherwise, the client connection is closed.
인증이 성공하면, 후속 패킷들으 카프카 API 요청들로 처리된다. 그렇지 않으면 클라이언트 연결이 끊긴다.

For interoperability with 0.9.0.x clients, the first packet received by the server is handled as a SASL/GSSAPI client token if it is not a valid Kafka request. SASL/GSSAPI authentication is performed starting with this packet, skipping the first two steps above.
0.9.0.x 클라이언트와의 상호운용성을 위하여, 만약 유효한 카프카 요청이 아니라면,
서버에서 받은 첫번째 패킷은 SASL/GSSAPI 클라이언트 토큰으로 처리된다.
SASL/GSSAPI 인증은 이 패킷으로 동작이 시작되고, 위의 두 단계는 건너뛴다.

The Protocol

Protocol Primitive Types

The protocol is built out of the following primitive types.
프로토콜은 다음에 나오는 기본 타입들로 구축되었다.

TYPE	DESCRIPTION
BOOLEAN	Represents a boolean value in a byte. Values 0 and 1 are used to represent false and true respectively. When reading a boolean value, any non-zero value is considered true.
INT8	Represents an integer between -2⁷ and 2⁷-1 inclusive.
INT16	Represents an integer between -2¹⁵ and 2¹⁵-1 inclusive. The values are encoded using two bytes in network byte order (big-endian). 이 값은 2바이트이고, 네트워크 바이트 순서로 인코딩 되어 있다 (big-endian)
INT32	Represents an integer between -2³¹ and 2³¹-1 inclusive. The values are encoded using four bytes in network byte order (big-endian). 이 값은 4바이트이고, 네트워크 바이트 순서로 인코딩 되어 있다 (big-endian)
INT64	Represents an integer between -2⁶³ and 2⁶³-1 inclusive. The values are encoded using eight bytes in network byte order (big-endian). 이 값은 8바이트이고, 네트워크 바이트 순서로 인코딩 되어 있다 (big-endian)
UINT32	Represents an integer between 0 and 2³²-1 inclusive. The values are encoded using four bytes in network byte order (big-endian). 이 값은 4바이트이고, 네트워크 바이트 순서로 인코딩 되어 있다 (big-endian)
VARINT	Represents an integer between -2³¹ and 2³¹-1 inclusive. Encoding follows the variable-length zig-zag encoding from Google Protocol Buffers.
VARLONG	Represents an integer between -2⁶³ and 2⁶³-1 inclusive. Encoding follows the variable-length zig-zag encoding from Google Protocol Buffers.
UUID	Represents a type 4 immutable universally unique identifier (Uuid). The values are encoded using sixteen bytes in network byte order (big-endian).
FLOAT64	Represents a double-precision 64-bit format IEEE 754 value. The values are encoded using eight bytes in network byte order (big-endian).
STRING	Represents a sequence of characters. First the length N is given as an INT16. Then N bytes follow which are the UTF-8 encoding of the character sequence. Length must not be negative. 연속된 문자들을 표현한다. 먼저 길이 N 이 INT16 (2byte) 으로 나오고, 그 뒤에 UTF-8 로 인코딩된 연속 문자들이 이어진다. 길이는 음수가 될수 없다.
COMPACT_STRING	Represents a sequence of characters. First the length N + 1 is given as an UNSIGNED_VARINT . Then N bytes follow which are the UTF-8 encoding of the character sequence.
NULLABLE_STRING	Represents a sequence of characters or null. For non-null strings, first the length N is given as an INT16. Then N bytes follow which are the UTF-8 encoding of the character sequence. A null value is encoded with length of -1 and there are no following bytes.
COMPACT_NULLABLE_STRING	Represents a sequence of characters. First the length N + 1 is given as an UNSIGNED_VARINT . Then N bytes follow which are the UTF-8 encoding of the character sequence. A null string is represented with a length of 0.
BYTES	Represents a raw sequence of bytes. First the length N is given as an INT32. Then N bytes follow. 연속된 원시 바이트들을 표현한다. 먼저 길이 N 이 INT32 (4byte) 로 나오고, 그 뒤에 바이트들이 따라온다.
COMPACT_BYTES	Represents a raw sequence of bytes. First the length N+1 is given as an UNSIGNED_VARINT.Then N bytes follow.
NULLABLE_BYTES	Represents a raw sequence of bytes or null. For non-null values, first the length N is given as an INT32. Then N bytes follow. A null value is encoded with length of -1 and there are no following bytes.
COMPACT_NULLABLE_BYTES	Represents a raw sequence of bytes. First the length N+1 is given as an UNSIGNED_VARINT.Then N bytes follow. A null object is represented with a length of 0.
RECORDS	Represents a sequence of Kafka records as NULLABLE_BYTES. For a detailed description of records see Message Sets.
ARRAY	Represents a sequence of objects of a given type T. Type T can be either a primitive type (e.g. STRING) or a structure. First, the length N is given as an INT32. Then N instances of type T follow. A null array is represented with a length of -1. In protocol documentation an array of T instances is referred to as [T]. 타입 T 의 연속된 객체들을 표현한다. 타입 T는 기본 타입이거나 구조체가 될수 있다. 먼저 길시 N 이 INT32 (4byte) 로 나오고, 그 뒤에 타입 T 의 인스턴스들이 이어진다. null 배열은 길이가 -1로 표시된다. 프로토콜 문서에서 T 인스턴스들의 배열은 [T] 로 언급한다.
COMPACT_ARRAY	Represents a sequence of objects of a given type T. Type T can be either a primitive type (e.g. STRING) or a structure. First, the length N + 1 is given as an UNSIGNED_VARINT. Then N instances of type T follow. A null array is represented with a length of 0. In protocol documentation an array of T instances is referred to as [T].

Notes on reading the request format grammars

The BNFs below give an exact context free grammar for the request and response binary format. The BNF is intentionally not compact in order to give human-readable name. As always in a BNF a sequence of productions indicates concatenation. When there are multiple possible productions these are separated with '|' and may be enclosed in parenthesis for grouping. The top-level definition is always given first and subsequent sub-parts are indented.
여기는 해석을 못하겠다. ㅠㅠ

Common Request and Response Structure

All requests and responses originate from the following grammar which will be incrementally describe through the rest of this document:
모든 요청과 응답은 다음 문법에서 나오고, 이 문서의 나머지에서 계속 설명한다.

RequestOrResponse => Size (RequestMessage | ResponseMessage)
  Size => int32

FIELD

DESCRIPTION

message_size

The message_size field gives the size of the subsequent request or response message in bytes. The client can read requests by first reading this 4 byte size as an integer N, and then reading and parsing the subsequent N bytes of the request.
message_size 필드는 뒤에 나오는 요청이나 응답 메세지가 몇바이트 인지 알려준다.
클라이언트는 처음 이 4바이트를 정수 N으로 읽은 후, 뒤에 나오는 요청의 N바이트를 읽고 파싱한다.

Record Batch

A description of the record batch format can be found here.
레코드 배치 형식의 설명은 여기서 찾을수 있다.

Constants

Error Codes

We use numeric codes to indicate what problem occurred on the server. These can be translated by the client into exceptions or whatever the appropriate error handling mechanism in the client language. Here is a table of the error codes currently in use:

ERROR	CODE	RETRIABLE	DESCRIPTION
UNKNOWN_SERVER_ERROR	-1	False	The server experienced an unexpected error when processing the request.
NONE	0	False
OFFSET_OUT_OF_RANGE	1	False	The requested offset is not within the range of offsets maintained by the server.
CORRUPT_MESSAGE	2	True	This message has failed its CRC checksum, exceeds the valid size, has a null key for a compacted topic, or is otherwise corrupt.
UNKNOWN_TOPIC_OR_PARTITION	3	True	This server does not host this topic-partition.
INVALID_FETCH_SIZE	4	False	The requested fetch size is invalid.
LEADER_NOT_AVAILABLE	5	True	There is no leader for this topic-partition as we are in the middle of a leadership election.
NOT_LEADER_OR_FOLLOWER	6	True	For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.
REQUEST_TIMED_OUT	7	True	The request timed out.
BROKER_NOT_AVAILABLE	8	False	The broker is not available.
REPLICA_NOT_AVAILABLE	9	True	The replica is not available for the requested topic-partition. Produce/Fetch requests and other requests intended only for the leader or follower return NOT_LEADER_OR_FOLLOWER if the broker is not a replica of the topic-partition.
MESSAGE_TOO_LARGE	10	False	The request included a message larger than the max message size the server will accept.
STALE_CONTROLLER_EPOCH	11	False	The controller moved to another broker.
OFFSET_METADATA_TOO_LARGE	12	False	The metadata field of the offset request was too large.
NETWORK_EXCEPTION	13	True	The server disconnected before a response was received.
COORDINATOR_LOAD_IN_PROGRESS	14	True	The coordinator is loading and hence can't process requests.
COORDINATOR_NOT_AVAILABLE	15	True	The coordinator is not available.
NOT_COORDINATOR	16	True	This is not the correct coordinator.
INVALID_TOPIC_EXCEPTION	17	False	The request attempted to perform an operation on an invalid topic.
RECORD_LIST_TOO_LARGE	18	False	The request included message batch larger than the configured segment size on the server.
NOT_ENOUGH_REPLICAS	19	True	Messages are rejected since there are fewer in-sync replicas than required.
NOT_ENOUGH_REPLICAS_AFTER_APPEND	20	True	Messages are written to the log, but to fewer in-sync replicas than required.
INVALID_REQUIRED_ACKS	21	False	Produce request specified an invalid value for required acks.
ILLEGAL_GENERATION	22	False	Specified group generation id is not valid.
INCONSISTENT_GROUP_PROTOCOL	23	False	The group member's supported protocols are incompatible with those of existing members or first group member tried to join with empty protocol type or empty protocol list.
INVALID_GROUP_ID	24	False	The configured groupId is invalid.
UNKNOWN_MEMBER_ID	25	False	The coordinator is not aware of this member.
INVALID_SESSION_TIMEOUT	26	False	The session timeout is not within the range allowed by the broker (as configured by group.min.session.timeout.ms and group.max.session.timeout.ms).
REBALANCE_IN_PROGRESS	27	False	The group is rebalancing, so a rejoin is needed.
INVALID_COMMIT_OFFSET_SIZE	28	False	The committing offset data size is not valid.
TOPIC_AUTHORIZATION_FAILED	29	False	Topic authorization failed.
GROUP_AUTHORIZATION_FAILED	30	False	Group authorization failed.
CLUSTER_AUTHORIZATION_FAILED	31	False	Cluster authorization failed.
INVALID_TIMESTAMP	32	False	The timestamp of the message is out of acceptable range.
UNSUPPORTED_SASL_MECHANISM	33	False	The broker does not support the requested SASL mechanism.
ILLEGAL_SASL_STATE	34	False	Request is not valid given the current SASL state.
UNSUPPORTED_VERSION	35	False	The version of API is not supported.
TOPIC_ALREADY_EXISTS	36	False	Topic with this name already exists.
INVALID_PARTITIONS	37	False	Number of partitions is below 1.
INVALID_REPLICATION_FACTOR	38	False	Replication factor is below 1 or larger than the number of available brokers.
INVALID_REPLICA_ASSIGNMENT	39	False	Replica assignment is invalid.
INVALID_CONFIG	40	False	Configuration is invalid.
NOT_CONTROLLER	41	True	This is not the correct controller for this cluster.
INVALID_REQUEST	42	False	This most likely occurs because of a request being malformed by the client library or the message was sent to an incompatible broker. See the broker logs for more details.
UNSUPPORTED_FOR_MESSAGE_FORMAT	43	False	The message format version on the broker does not support the request.
POLICY_VIOLATION	44	False	Request parameters do not satisfy the configured policy.
OUT_OF_ORDER_SEQUENCE_NUMBER	45	False	The broker received an out of order sequence number.
DUPLICATE_SEQUENCE_NUMBER	46	False	The broker received a duplicate sequence number.
INVALID_PRODUCER_EPOCH	47	False	Producer attempted to produce with an old epoch.
INVALID_TXN_STATE	48	False	The producer attempted a transactional operation in an invalid state.
INVALID_PRODUCER_ID_MAPPING	49	False	The producer attempted to use a producer id which is not currently assigned to its transactional id.
INVALID_TRANSACTION_TIMEOUT	50	False	The transaction timeout is larger than the maximum value allowed by the broker (as configured by transaction.max.timeout.ms).
CONCURRENT_TRANSACTIONS	51	False	The producer attempted to update a transaction while another concurrent operation on the same transaction was ongoing.
TRANSACTION_COORDINATOR_FENCED	52	False	Indicates that the transaction coordinator sending a WriteTxnMarker is no longer the current coordinator for a given producer.
TRANSACTIONAL_ID_AUTHORIZATION_FAILED	53	False	Transactional Id authorization failed.
SECURITY_DISABLED	54	False	Security features are disabled.
OPERATION_NOT_ATTEMPTED	55	False	The broker did not attempt to execute this operation. This may happen for batched RPCs where some operations in the batch failed, causing the broker to respond without trying the rest.
KAFKA_STORAGE_ERROR	56	True	Disk error when trying to access log file on the disk.
LOG_DIR_NOT_FOUND	57	False	The user-specified log directory is not found in the broker config.
SASL_AUTHENTICATION_FAILED	58	False	SASL Authentication failed.
UNKNOWN_PRODUCER_ID	59	False	This exception is raised by the broker if it could not locate the producer metadata associated with the producerId in question. This could happen if, for instance, the producer's records were deleted because their retention time had elapsed. Once the last records of the producerId are removed, the producer's metadata is removed from the broker, and future appends by the producer will return this exception.
REASSIGNMENT_IN_PROGRESS	60	False	A partition reassignment is in progress.
DELEGATION_TOKEN_AUTH_DISABLED	61	False	Delegation Token feature is not enabled.
DELEGATION_TOKEN_NOT_FOUND	62	False	Delegation Token is not found on server.
DELEGATION_TOKEN_OWNER_MISMATCH	63	False	Specified Principal is not valid Owner/Renewer.
DELEGATION_TOKEN_REQUEST_NOT_ALLOWED	64	False	Delegation Token requests are not allowed on PLAINTEXT/1-way SSL channels and on delegation token authenticated channels.
DELEGATION_TOKEN_AUTHORIZATION_FAILED	65	False	Delegation Token authorization failed.
DELEGATION_TOKEN_EXPIRED	66	False	Delegation Token is expired.
INVALID_PRINCIPAL_TYPE	67	False	Supplied principalType is not supported.
NON_EMPTY_GROUP	68	False	The group is not empty.
GROUP_ID_NOT_FOUND	69	False	The group id does not exist.
FETCH_SESSION_ID_NOT_FOUND	70	True	The fetch session ID was not found.
INVALID_FETCH_SESSION_EPOCH	71	True	The fetch session epoch is invalid.
LISTENER_NOT_FOUND	72	True	There is no listener on the leader broker that matches the listener on which metadata request was processed.
TOPIC_DELETION_DISABLED	73	False	Topic deletion is disabled.
FENCED_LEADER_EPOCH	74	True	The leader epoch in the request is older than the epoch on the broker.
UNKNOWN_LEADER_EPOCH	75	True	The leader epoch in the request is newer than the epoch on the broker.
UNSUPPORTED_COMPRESSION_TYPE	76	False	The requesting client does not support the compression type of given partition.
STALE_BROKER_EPOCH	77	False	Broker epoch has changed.
OFFSET_NOT_AVAILABLE	78	True	The leader high watermark has not caught up from a recent leader election so the offsets cannot be guaranteed to be monotonically increasing.
MEMBER_ID_REQUIRED	79	False	The group member needs to have a valid member id before actually entering a consumer group.
PREFERRED_LEADER_NOT_AVAILABLE	80	True	The preferred leader was not available.
GROUP_MAX_SIZE_REACHED	81	False	The consumer group has reached its max size.
FENCED_INSTANCE_ID	82	False	The broker rejected this static consumer since another consumer with the same group.instance.id has registered with a different member.id.
ELIGIBLE_LEADERS_NOT_AVAILABLE	83	True	Eligible topic partition leaders are not available.
ELECTION_NOT_NEEDED	84	True	Leader election not needed for topic partition.
NO_REASSIGNMENT_IN_PROGRESS	85	False	No partition reassignment is in progress.
GROUP_SUBSCRIBED_TO_TOPIC	86	False	Deleting offsets of a topic is forbidden while the consumer group is actively subscribed to it.
INVALID_RECORD	87	False	This record has failed the validation on broker and hence will be rejected.
UNSTABLE_OFFSET_COMMIT	88	True	There are unstable offsets that need to be cleared.
THROTTLING_QUOTA_EXCEEDED	89	True	The throttling quota has been exceeded.
PRODUCER_FENCED	90	False	There is a newer producer with the same transactionalId which fences the current one.
RESOURCE_NOT_FOUND	91	False	A request illegally referred to a resource that does not exist.
DUPLICATE_RESOURCE	92	False	A request illegally referred to the same resource twice.
UNACCEPTABLE_CREDENTIAL	93	False	Requested credential would not meet criteria for acceptability.
INCONSISTENT_VOTER_SET	94	False	Indicates that the either the sender or recipient of a voter-only request is not one of the expected voters
INVALID_UPDATE_VERSION	95	False	The given update version was invalid.
FEATURE_UPDATE_FAILED	96	False	Unable to update finalized features due to an unexpected server error.
PRINCIPAL_DESERIALIZATION_FAILURE	97	False	Request principal deserialization failed during forwarding. This indicates an internal error on the broker cluster security setup.
SNAPSHOT_NOT_FOUND	98	False	Requested snapshot was not found
POSITION_OUT_OF_RANGE	99	False	Requested position is not greater than or equal to zero, and less than the size of the snapshot.
UNKNOWN_TOPIC_ID	100	True	This server does not host this topic ID.
DUPLICATE_BROKER_REGISTRATION	101	False	This broker ID is already in use.
BROKER_ID_NOT_REGISTERED	102	False	The given broker ID was not registered.
INCONSISTENT_TOPIC_ID	103	True	The log's topic ID did not match the topic ID in the request
INCONSISTENT_CLUSTER_ID	104	False	The clusterId in the request does not match that found on the server

Api Keys

'잡다한 자료' 카테고리의 다른 글

[ElasticSearch] 시간대별 문서건수 집계 (0)	2024.02.01
Redis Protocol specification (0)	2021.05.25
Redis. Keyspace Notifications (0)	2021.02.16
GitHub. Two-Factor Auth. SourceTree. ssh 연결 (0)	2020.12.22
aws. CloudWatch -> Lambda -> api 호출 (0)	2020.06.23

Posted by 돌비

컴퓨터 초보의 개발자료