導入
機械学習の準備として,まずルービックキューブを定式化する.そのために,Python で Cube
クラスを作る.Cube
では,次のプロパティとメソッドを想定する.
- ルービックキューブの状態を表すデータを保存する変数 states
- ルービックキューブの状態を初期化するための関数 reset
- 動きを受理して状態の遷移を定義する関数 move
- AI に使いやすい形の one-hot code を出力するための関数 one_hot
ルービックキューブのエンジニアリング
定数
Cube は Front
, Back
, Left
, Right
, Up
, Down
の6面を持っている.それぞれの面をその頭文字で表す.
FACES = { 'U': 0, # Up 'D': 1, # Down 'F': 2, # Front 'B': 3, # Back 'L': 4, # Left 'R': 5, # Right } FACES_ = {v: k for k, v in FACES.items()}
動きに関しては,6 面それぞれ時計回りと反時計回りで動けるので,計12パターンがある.面の大文字で時計回り,小文字で反時計回りで表すと,次のようになる.
ACTIONS = { "U", "u'", "D", "d'", "F", "f", "B", "b", "L", "l", "R", "r'", }
色に関しては,世界標準配色を採用したうえ,初期状態の面と同じコードを振る.
COLORS = { 'Y': 0, # Yellow 'W': 1, # White 'R': 2, # Red 'O': 3, # Orange 'B': 4, # Blue 'G': 5, # Green } COLORS_ = {v: k for k, v in COLORS.items()}
Cube は 26 ピースの小さい Cube がある.それらを cubelet(s) と呼ぶ.そのうち,六面に中央 cubelet 各1つ,コーナーにある cubelets 8 個,エージにある cubelets 12 個.中央 cubelets は動けないため,考察の対象外にする.コーナー(エージー)の cubelets は,初期状態でどの3つ(2つ)の面にあるのかによって,それらの面の3(2)文字で表す.例えば,UFR
は 上,前,右にある cubelet を表す.
CUBELETS = { # コーナー 'UFR': 0, 'URB': 1, 'UBL': 2, 'ULF': 3, 'DFL': 4, 'DLB': 5, 'DBR': 6, 'DRF': 7, # エッジ 'UF': 8, 'UR': 9, 'UB': 10, 'UL': 11, 'FR': 12, 'FL': 13, 'BR': 14, 'BL': 15, 'DF': 16, 'DR': 17, 'DB': 18, 'DL': 19, } CUBELETS_ = {v: k for k, v in CUBELETS.items()}
最後に,便利のためあとで使う INDICES_24
を定義しておく.
INDICES_24 = [i for i in range(24)]
Cube クラス
状態と One-Hot Code
まず,cubelet と slot(枠)という2つの概念を分けて考えなければならない.UFR
という slot といえば,上前右の位置にある cubelet を収めるための仮想の枠の意味で使う.UFR
という cubelet とえば,揃えた状態で UFR
slot にある cubelet (すなわち黄色,赤,緑の三色の cubelet) のことを指す.
状態を特定するために,2通りの方法がある.
1. ある cubelet に対し,どの slot にあるのかを記録する
2. ある slot に対し,どの cubelet を収めているのかを記録する
後で述べる move
の定義の便利上,ここで 2 の方式で定義する.中央 cubelets を除いたものの位置と向きを特定すれば良い.
コーナー cubelets は8個x3つの向きで,エージ cubelets は12個x2つの向きである.よって,どの cubelet に対しても,24種類の状態しかない.よって,AIへ渡す One-Hot Code のサイズは 20x24 で良い.一方,計算の都合上,状態の定義は 20x2 のテンサーにする.axis 0 の $i (0 \le i < 20)$ 次元は,前述 CUBELETS
という辞書で定義した対応する枠を指す.axis 1 の第 0 次元はどの cubelet かを指し,コーナーの場合は 0 ~ 7,エージの場合は 0 ~ 11 の番号が指定できる.axis 1 の第 1 次元はどの向きを向いているかを指す.コーナーの場合,揃えた状態で上もしくは下を向く sticker (cubelet の面)が上もしくは下を向くとき,値を 0 とする.この状態から観察者からして逆時計回りで 120 度,240 度回した場合,それぞれ 1 と 2 で表す.エージの場合,六面を U = D > F = B > L = R という優先順位を付け,揃えた状態で高優先順位 sticker がまた高優先順位である場合は 0,そうでない場合は 1 とする.
実装すると,次のようになる.
class Cube: """ Rubik's Cube クラス :arg states: 20 x 2 ndarray, (location, phase). Cubelet slot states. For example, `self.states[CUBELETS['UFR']]` stand for the up-front-right corner cubelet slot, and the yellow-red-green corner cubelet should be in it for a solved cube. See also `CUBELETS`. For index (i, j), 0 <= i < 20, 0 <= j < 3. If i 0 <= i < 8, it refers to a corner cubelet slot. For a solved cube, the default value of (i, 0) should be i, and (i, 1) should be 0. (i, 0) can be any value between 0 and 7, indicating which corner cubelet in the slot i. (i, 1) can be 0, 1, or 2, corresponding to 0, 120, or 240 degree the cubelet rotated counterclockwise, where the 0 degree indicate the white or yellow stick of the cubelet facing up or down. Let m, n be (states[i, 0], states[i, 1]), then the one-hot value can be expressed by m + n * 8. If 8 <= i < 20, it refers to an edge cubelet slot. For a solved cube, the default value of (i, 0) should be i - 8, and (i, 1) should be 0. (i, 0) can be any value between 0 and 11, indicating which edge cubelet in the slot i. (i, 1) can be 0 or 1, corresponding to 0 or 180 degree. Let m, n be (states[i, 0], states[i, 1]), then the one-hot value can be expressed by m + n * 12. """ def __init__(self): self.states = np.empty((20, 2), dtype=np.unit8) self.reset() def reset(self): self.states[:, 0] = np.arange(20, dtype=np.uint8) def one_hot(self) -> np.ndarray: states = np.empty(20, dtype=np.uint8) states[:8] = self.states[:8, 0] + self.states[:8, 1] * 8 states[8:] = self.states[8:, 0] + self.states[8:, 1] * 12 one_hot = np.zeros((20, 24), dtype=np.uint8) one_hot[INDICES_24, states] = 1 return one_hot
移動
次に,移動による状態遷移を定義する.
def move(self, action): match action: case "F": # Corner ufr_0, ufr_1 = self.states[CUBELETS['UFR']] ulf_0, ulf_1 = self.states[CUBELETS['ULF']] dfl_0, dfl_1 = self.states[CUBELETS['DFL']] drf_0, drf_1 = self.states[CUBELETS['DRF']] self.states[CUBELETS['UFR']] = ulf_0, (ulf_1 + 2) % 3 self.states[CUBELETS['ULF']] = dfl_0, (dfl_1 + 1) % 3 self.states[CUBELETS['DFL']] = drf_0, (drf_1 + 2) % 3 self.states[CUBELETS['DRF']] = ufr_0, (ufr_1 + 1) % 3 # Edge a, b, c, d = CUBELETS['UF'], CUBELETS['FL'], CUBELETS['DF'], CUBELETS['FR'] _states = self.states[[b, c, d, a]] _states[:, 1] = (_states[:, 1] + 1) % 2 self.states[[a, b, c, d]] = _states case "f": # Corner ufr_0, ufr_1 = self.states[CUBELETS['UFR']] ulf_0, ulf_1 = self.states[CUBELETS['ULF']] dfl_0, dfl_1 = self.states[CUBELETS['DFL']] drf_0, drf_1 = self.states[CUBELETS['DRF']] self.states[CUBELETS['UFR']] = drf_0, (drf_1 + 2) % 3 self.states[CUBELETS['ULF']] = ufr_0, (ufr_1 + 1) % 3 self.states[CUBELETS['DFL']] = ulf_0, (ulf_1 + 2) % 3 self.states[CUBELETS['DRF']] = dfl_0, (dfl_1 + 1) % 3 # Edge a, b, c, d = CUBELETS['UF'], CUBELETS['FL'], CUBELETS['DF'], CUBELETS['FR'] _states = self.states[[d, a, b, c]] _states[:, 1] = (_states[:, 1] + 1) % 2 self.states[[a, b, c, d]] = _states case "B": # Corner urb_0, urb_1 = self.states[CUBELETS['URB']] dbr_0, dbr_1 = self.states[CUBELETS['DBR']] dlb_0, dlb_1 = self.states[CUBELETS['DLB']] ubl_0, ubl_1 = self.states[CUBELETS['UBL']] self.states[CUBELETS['URB']] = dbr_0, (dbr_1 + 1) % 3 self.states[CUBELETS['DBR']] = dlb_0, (dlb_1 + 2) % 3 self.states[CUBELETS['DLB']] = ubl_0, (ubl_1 + 1) % 3 self.states[CUBELETS['UBL']] = urb_0, (urb_1 + 2) % 3 # Edge a, b, c, d = CUBELETS['UB'], CUBELETS['BR'], CUBELETS['DB'], CUBELETS['BL'] _states = self.states[[b, c, d, a]] _states[:, 1] = (_states[:, 1] + 1) % 2 self.states[[a, b, c, d]] = _states case "b": # Corner urb_0, urb_1 = self.states[CUBELETS['URB']] dbr_0, dbr_1 = self.states[CUBELETS['DBR']] dlb_0, dlb_1 = self.states[CUBELETS['DLB']] ubl_0, ubl_1 = self.states[CUBELETS['UBL']] self.states[CUBELETS['URB']] = ubl_0, (ubl_1 + 1) % 3 self.states[CUBELETS['DBR']] = urb_0, (urb_1 + 2) % 3 self.states[CUBELETS['DLB']] = dbr_0, (dbr_1 + 1) % 3 self.states[CUBELETS['UBL']] = dlb_0, (dlb_1 + 2) % 3 # Edge a, b, c, d = CUBELETS['UB'], CUBELETS['BR'], CUBELETS['DB'], CUBELETS['BL'] _states = self.states[[d, a, b, c]] _states[:, 1] = (_states[:, 1] + 1) % 2 self.states[[a, b, c, d]] = _states case "L": # Corner ulf_0, ulf_1 = self.states[CUBELETS['ULF']] ubl_0, ubl_1 = self.states[CUBELETS['UBL']] dlb_0, dlb_1 = self.states[CUBELETS['DLB']] dfl_0, dfl_1 = self.states[CUBELETS['DFL']] self.states[CUBELETS['ULF']] = ubl_0, (ubl_1 + 2) % 3 self.states[CUBELETS['UBL']] = dlb_0, (dlb_1 + 1) % 3 self.states[CUBELETS['DLB']] = dfl_0, (dfl_1 + 2) % 3 self.states[CUBELETS['DFL']] = ulf_0, (ulf_1 + 1) % 3 # Edge a, b, c, d = CUBELETS['UL'], CUBELETS['BL'], CUBELETS['DL'], CUBELETS['FL'] self.states[[a, b, c, d]] = self.states[[b, c, d, a]] case "l": # Corner ulf_0, ulf_1 = self.states[CUBELETS['ULF']] ubl_0, ubl_1 = self.states[CUBELETS['UBL']] dlb_0, dlb_1 = self.states[CUBELETS['DLB']] dfl_0, dfl_1 = self.states[CUBELETS['DFL']] self.states[CUBELETS['ULF']] = dfl_0, (dfl_1 + 2) % 3 self.states[CUBELETS['UBL']] = ulf_0, (ulf_1 + 1) % 3 self.states[CUBELETS['DLB']] = ubl_0, (ubl_1 + 2) % 3 self.states[CUBELETS['DFL']] = dlb_0, (dlb_1 + 1) % 3 # Edge a, b, c, d = CUBELETS['UL'], CUBELETS['BL'], CUBELETS['DL'], CUBELETS['FL'] self.states[[a, b, c, d]] = self.states[[d, a, b, c]] case "R": # Corner urb_0, urb_1 = self.states[CUBELETS['URB']] ufr_0, ufr_1 = self.states[CUBELETS['UFR']] drf_0, drf_1 = self.states[CUBELETS['DRF']] dbr_0, dbr_1 = self.states[CUBELETS['DBR']] self.states[CUBELETS['URB']] = ufr_0, (ufr_1 + 2) % 3 self.states[CUBELETS['UFR']] = drf_0, (drf_1 + 1) % 3 self.states[CUBELETS['DRF']] = dbr_0, (dbr_1 + 2) % 3 self.states[CUBELETS['DBR']] = urb_0, (urb_1 + 1) % 3 # Edge a, b, c, d = CUBELETS['UR'], CUBELETS['FR'], CUBELETS['DR'], CUBELETS['BR'] self.states[[a, b, c, d]] = self.states[[b, c, d, a]] case "r": # Corner urb_0, urb_1 = self.states[CUBELETS['URB']] ufr_0, ufr_1 = self.states[CUBELETS['UFR']] drf_0, drf_1 = self.states[CUBELETS['DRF']] dbr_0, dbr_1 = self.states[CUBELETS['DBR']] self.states[CUBELETS['URB']] = dbr_0, (dbr_1 + 2) % 3 self.states[CUBELETS['UFR']] = urb_0, (urb_1 + 1) % 3 self.states[CUBELETS['DRF']] = ufr_0, (ufr_1 + 2) % 3 self.states[CUBELETS['DBR']] = drf_0, (drf_1 + 1) % 3 # Edge a, b, c, d = CUBELETS['UR'], CUBELETS['FR'], CUBELETS['DR'], CUBELETS['BR'] self.states[[a, b, c, d]] = self.states[[d, a, b, c]] case "U": # Corner a, b, c, d = CUBELETS['URB'], CUBELETS['UBL'], CUBELETS['ULF'], CUBELETS['UFR'] self.states[[a, b, c, d]] = self.states[[b, c, d, a]] # Edge a, b, c, d = CUBELETS['UB'], CUBELETS['UL'], CUBELETS['UF'], CUBELETS['UR'] self.states[[a, b, c, d]] = self.states[[b, c, d, a]] case "u": # Corner a, b, c, d = CUBELETS['URB'], CUBELETS['UBL'], CUBELETS['ULF'], CUBELETS['UFR'] self.states[[a, b, c, d]] = self.states[[d, a, b, c]] # Edge a, b, c, d = CUBELETS['UB'], CUBELETS['UL'], CUBELETS['UF'], CUBELETS['UR'] self.states[[a, b, c, d]] = self.states[[d, a, b, c]] case "D": # Corner a, b, c, d = CUBELETS['DRF'], CUBELETS['DFL'], CUBELETS['DLB'], CUBELETS['DBR'] self.states[[a, b, c, d]] = self.states[[b, c, d, a]] # Edge a, b, c, d = CUBELETS['DF'], CUBELETS['DL'], CUBELETS['DB'], CUBELETS['DR'] self.states[[a, b, c, d]] = self.states[[b, c, d, a]] case "d": # Corner a, b, c, d = CUBELETS['DRF'], CUBELETS['DFL'], CUBELETS['DLB'], CUBELETS['DBR'] self.states[[a, b, c, d]] = self.states[[d, a, b, c]] # Edge a, b, c, d = CUBELETS['DF'], CUBELETS['DL'], CUBELETS['DB'], CUBELETS['DR'] self.states[[a, b, c, d]] = self.states[[d, a, b, c]] case _: raise ValueError("ACTION UNKNOWN")
次の文で,Unity でルービックキューブを実装し,ここで定義している Cube
クラスの正当性を確認するために目視でユニットテストを行う.